Overview

Dataset statistics

Number of variables: 17
Number of observations: 10000
Missing cells: 2485
Missing cells (%): 1.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.4 MiB
Average record size in memory: 144.0 B
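
The percentages in this block can be checked from the raw counts. The following is a minimal sketch, assuming the missing-cell share is the number of missing cells divided by variables × observations, and that total memory is roughly the average record size times the number of rows. Both relationships are assumptions about how the report derives its figures, and the variable names are illustrative.

```python
# Counts as reported in the Dataset statistics block.
n_variables = 17
n_observations = 10_000
missing_cells = 2_485
avg_record_bytes = 144.0

# Assumed definition: share of missing cells over all cells in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumed relationship: total size = average record size x number of rows.
total_mib = avg_record_bytes * n_observations / 2**20

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Total size in memory: {total_mib:.1f} MiB")
```

Under these assumptions both values round to the figures shown in the report (1.5% and 1.4 MiB).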

Variable types

Text: 8
Numeric: 7
Categorical: 1
DateTime: 1

Alerts

Id is highly overall correlated with VoteCount [High correlation]
VoteCount is highly overall correlated with Id and 2 other fields [High correlation]
Budget is highly overall correlated with VoteCount and 1 other field [High correlation]
Revenue is highly overall correlated with VoteCount and 1 other field [High correlation]
OriginalLanguage is highly imbalanced (69.2%) [Imbalance]
TagLine has 2413 (24.1%) missing values [Missing]
Popularity is highly skewed (γ1 = 20.16937129) [Skewed]
Id has unique values [Unique]
VoteAverage has 261 (2.6%) zeros [Zeros]
VoteCount has 260 (2.6%) zeros [Zeros]
Budget has 4472 (44.7%) zeros [Zeros]
RunTime has 137 (1.4%) zeros [Zeros]
Revenue has 4155 (41.5%) zeros [Zeros]

Reproduction

Analysis started: 2023-12-10 15:07:27.447617
Analysis finished: 2023-12-10 15:07:38.593574
Duration: 11.15 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
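
The reported duration follows from the two timestamps. A small sketch that recomputes it with the standard library (the format string and variable names are illustrative, not taken from the report's configuration):

```python
from datetime import datetime

# Timestamp layout used in the Reproduction block.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-10 15:07:27.447617", FMT)
finished = datetime.strptime("2023-12-10 15:07:38.593574", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.2f} seconds")
```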

Variables

Distinct: 2278
Distinct (%): 22.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 43
Median length: 36
Mean length: 11.8537
Min length: 2

Characters and Unicode

Total characters: 118537
Distinct characters: 14
Distinct categories: 5
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
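
As an illustration of the character properties mentioned above, the sketch below counts characters by Unicode general category using Python's `unicodedata` module. It runs only on the five sample rows shown for this variable. Whether the report itself uses `unicodedata` is an assumption, and `category_counts` is a hypothetical helper name.

```python
import unicodedata
from collections import Counter

# The five sample rows listed for this variable.
samples = ["[28, 12, 53]", "[28, 53, 80]", "[16, 28, 14]", "[28, 53]", "[53, 18]"]


def category_counts(values):
    """Count characters by Unicode general category code (e.g. 'Nd', 'Po')."""
    counts = Counter()
    for value in values:
        counts.update(unicodedata.category(ch) for ch in value)
    return counts


counts = category_counts(samples)
for code, count in counts.most_common():
    print(code, count)
```

The two-letter codes correspond to the category names used in the report: `Nd` is Decimal Number, `Po` is Other Punctuation, `Zs` is Space Separator, `Ps` is Open Punctuation, and `Pe` is Close Punctuation.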

Unique

Unique: 1416
Unique (%): 14.2%

Sample

1st row: [28, 12, 53]
2nd row: [28, 53, 80]
3rd row: [16, 28, 14]
4th row: [28, 53]
5th row: [53, 18]
Value              Count   Frequency (%)
18                 3785    14.5%
35                 3004    11.5%
28                 2776    10.6%
53                 2659    10.2%
12                 1884    7.2%
10749              1576    6.0%
27                 1556    5.9%
14                 1343    5.1%
10751              1337    5.1%
16                 1318    5.0%
Other values (10)  4943    18.9%

Most occurring characters

Value              Count   Frequency (%)
,                  16181   13.7%
(space)            16181   13.7%
1                  13312   11.2%
8                  11217   9.5%
[                  10000   8.4%
]                  10000   8.4%
5                  7303    6.2%
2                  6744    5.7%
7                  6569    5.5%
3                  6233    5.3%
Other values (4)   14797   12.5%

Most occurring categories

Value              Count   Frequency (%)
Decimal Number     66175   55.8%
Other Punctuation  16181   13.7%
Space Separator    16181   13.7%
Open Punctuation   10000   8.4%
Close Punctuation  10000   8.4%

Most frequent character per category

Decimal Number
Value              Count   Frequency (%)
1                  13312   20.1%
8                  11217   17.0%
5                  7303    11.0%
2                  6744    10.2%
7                  6569    9.9%
3                  6233    9.4%
0                  5381    8.1%
4                  4029    6.1%
9                  2771    4.2%
6                  2616    4.0%

Other Punctuation
Value              Count   Frequency (%)
,                  16181   100.0%

Space Separator
Value              Count   Frequency (%)
(space)            16181   100.0%

Open Punctuation
Value              Count   Frequency (%)
[                  10000   100.0%

Close Punctuation
Value              Count   Frequency (%)
]                  10000   100.0%

Most occurring scripts

Value              Count   Frequency (%)
Common             118537  100.0%

Most frequent character per script

Common
Value              Count   Frequency (%)
,                  16181   13.7%
(space)            16181   13.7%
1                  13312   11.2%
8                  11217   9.5%
[                  10000   8.4%
]                  10000   8.4%
5                  7303    6.2%
2                  6744    5.7%
7                  6569    5.5%
3                  6233    5.3%
Other values (4)   14797   12.5%

Most occurring blocks

Value              Count   Frequency (%)
ASCII              118537  100.0%

Most frequent character per block

ASCII
Value              Count   Frequency (%)
,                  16181   13.7%
(space)            16181   13.7%
1                  13312   11.2%
8                  11217   9.5%
[                  10000   8.4%
]                  10000   8.4%
5                  7303    6.2%
2                  6744    5.7%
7                  6569    5.5%
3                  6233    5.3%
Other values (4)   14797   12.5%

Id
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 306598.19
Minimum: 5
Maximum: 1191902
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 156.2 KiB

Quantile statistics

Minimum: 5
5-th percentile: 887.95
Q1: 11391.25
Median: 117257
Q3: 535427.75
95-th percentile: 1020601
Maximum: 1191902
Range: 1191897
Interquartile range (IQR): 524036.5
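
The range and interquartile range are simple differences of the quantiles listed above. A short worked check using the reported values for Id (variable names are illustrative):

```python
# Quantile statistics as reported for Id.
minimum, maximum = 5, 1191902
q1, q3 = 11391.25, 535427.75

# Range spans the full data; IQR spans the middle 50% of the values.
value_range = maximum - minimum
iqr = q3 - q1

print(f"Range: {value_range}")
print(f"Interquartile range (IQR): {iqr}")
```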

Descriptive statistics

Standard deviation: 350362.44
Coefficient of variation (CV): 1.1427414
Kurtosis: -0.38422622
Mean: 306598.19
Median Absolute Deviation (MAD): 116438.5
Skewness: 0.91953223
Sum: 3.0659819 × 10⁹
Variance: 1.2275384 × 10¹¹
Monotonicity: Not monotonic
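
Several of these descriptive statistics are tied together by simple identities: CV is the standard deviation divided by the mean, variance is the squared standard deviation, and the sum is the mean times the number of observations. The sketch below checks this against the reported (rounded) values for Id, so small differences in the last digits are expected.

```python
# Reported values for Id (already rounded in the report).
mean = 306598.19
std = 350362.44
n_observations = 10_000

cv = std / mean                # coefficient of variation
variance = std ** 2            # variance is the squared standard deviation
total = mean * n_observations  # sum of all values

print(f"CV: {cv:.4f}")
print(f"Variance: {variance:.4e}")
print(f"Sum: {total:.4e}")
```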
Histogram with fixed size bins (bins=50)
Common values
Value                Count   Frequency (%)
299054               1       < 0.1%
803279               1       < 0.1%
10407                1       < 0.1%
13341                1       < 0.1%
12689                1       < 0.1%
306745               1       < 0.1%
1175873              1       < 0.1%
24248                1       < 0.1%
901                  1       < 0.1%
531                  1       < 0.1%
Other values (9990)  9990    99.9%

Minimum 10 values
Value                Count   Frequency (%)
5                    1       < 0.1%
11                   1       < 0.1%
12                   1       < 0.1%
13                   1       < 0.1%
14                   1       < 0.1%
15                   1       < 0.1%
16                   1       < 0.1%
18                   1       < 0.1%
19                   1       < 0.1%
22                   1       < 0.1%

Maximum 10 values
Value                Count   Frequency (%)
1191902              1       < 0.1%
1191885              1       < 0.1%
1191557              1       < 0.1%
1191556              1       < 0.1%
1191268              1       < 0.1%
1191086              1       < 0.1%
1190610              1       < 0.1%
1190581              1       < 0.1%
1190531              1       < 0.1%
1190476              1       < 0.1%

OriginalLanguage
Categorical

IMBALANCE 

Distinct: 50
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value              Count
en                 7498
ja                 602
ko                 319
fr                 308
es                 296
Other values (45)  977

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 20000
Distinct characters: 24
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 15
Unique (%): 0.1%

Sample

1st row: en
2nd row: en
3rd row: en
4th row: en
5th row: es

Common Values

Value              Count   Frequency (%)
en                 7498    75.0%
ja                 602     6.0%
ko                 319     3.2%
fr                 308     3.1%
es                 296     3.0%
zh                 162     1.6%
it                 152     1.5%
cn                 135     1.4%
de                 78      0.8%
ru                 65      0.7%
Other values (40)  385     3.9%

Length

Histogram of lengths of the category
Value              Count   Frequency (%)
en                 7498    75.0%
ja                 602     6.0%
ko                 319     3.2%
fr                 308     3.1%
es                 296     3.0%
zh                 162     1.6%
it                 152     1.5%
cn                 135     1.4%
de                 78      0.8%
ru                 65      0.7%
Other values (40)  385     3.9%

Most occurring characters

Value              Count   Frequency (%)
e                  7890    39.5%
n                  7692    38.5%
a                  646     3.2%
j                  602     3.0%
r                  393     2.0%
o                  352     1.8%
s                  333     1.7%
k                  327     1.6%
f                  319     1.6%
t                  287     1.4%
Other values (14)  1159    5.8%

Most occurring categories

Value              Count   Frequency (%)
Lowercase Letter   20000   100.0%

Most frequent character per category

Lowercase Letter
Value              Count   Frequency (%)
e                  7890    39.5%
n                  7692    38.5%
a                  646     3.2%
j                  602     3.0%
r                  393     2.0%
o                  352     1.8%
s                  333     1.7%
k                  327     1.6%
f                  319     1.6%
t                  287     1.4%
Other values (14)  1159    5.8%

Most occurring scripts

Value              Count   Frequency (%)
Latin              20000   100.0%

Most frequent character per script

Latin
Value              Count   Frequency (%)
e                  7890    39.5%
n                  7692    38.5%
a                  646     3.2%
j                  602     3.0%
r                  393     2.0%
o                  352     1.8%
s                  333     1.7%
k                  327     1.6%
f                  319     1.6%
t                  287     1.4%
Other values (14)  1159    5.8%

Most occurring blocks

Value              Count   Frequency (%)
ASCII              20000   100.0%

Most frequent character per block

ASCII
Value              Count   Frequency (%)
e                  7890    39.5%
n                  7692    38.5%
a                  646     3.2%
j                  602     3.0%
r                  393     2.0%
o                  352     1.8%
s                  333     1.7%
k                  327     1.6%
f                  319     1.6%
t                  287     1.4%
Other values (14)  1159    5.8%

Distinct: 9713
Distinct (%): 97.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 104
Median length: 61
Mean length: 15.6792
Min length: 1

Characters and Unicode

Total characters: 156792
Distinct characters: 2095
Distinct categories: 21
Distinct scripts: 17
Distinct blocks: 24
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9453
Unique (%): 94.5%

Sample

1st row: Expend4bles
2nd row: The Equalizer 3
3rd row: Mortal Kombat Legends: Cage Match
4th row: Mission: Impossible - Dead Reckoning Part One
5th row: Nowhere
ValueCountFrequency (%)
the 2557
 
9.2%
of 730
 
2.6%
a 328
 
1.2%
2 273
 
1.0%
and 229
 
0.8%
in 226
 
0.8%
226
 
0.8%
to 162
 
0.6%
la 142
 
0.5%
3 104
 
0.4%
Other values (9629) 22756
82.1%

Most occurring characters

ValueCountFrequency (%)
17714
 
11.3%
e 14622
 
9.3%
a 9276
 
5.9%
o 8352
 
5.3%
n 7768
 
5.0%
r 7701
 
4.9%
i 7462
 
4.8%
t 6900
 
4.4%
s 5742
 
3.7%
l 4893
 
3.1%
Other values (2085) 66362
42.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 101826
64.9%
Uppercase Letter 22487
 
14.3%
Space Separator 17734
 
11.3%
Other Letter 9891
 
6.3%
Other Punctuation 2519
 
1.6%
Decimal Number 1257
 
0.8%
Dash Punctuation 310
 
0.2%
Modifier Letter 290
 
0.2%
Nonspacing Mark 184
 
0.1%
Spacing Mark 116
 
0.1%
Other values (11) 178
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
277
 
2.8%
258
 
2.6%
140
 
1.4%
118
 
1.2%
114
 
1.2%
111
 
1.1%
97
 
1.0%
93
 
0.9%
93
 
0.9%
92
 
0.9%
Other values (1738) 8498
85.9%
Lowercase Letter
ValueCountFrequency (%)
e 14622
14.4%
a 9276
 
9.1%
o 8352
 
8.2%
n 7768
 
7.6%
r 7701
 
7.6%
i 7462
 
7.3%
t 6900
 
6.8%
s 5742
 
5.6%
l 4893
 
4.8%
h 4891
 
4.8%
Other values (113) 24219
23.8%
Uppercase Letter
ValueCountFrequency (%)
T 2862
 
12.7%
S 1877
 
8.3%
M 1454
 
6.5%
B 1390
 
6.2%
A 1322
 
5.9%
D 1260
 
5.6%
C 1230
 
5.5%
L 1160
 
5.2%
P 1054
 
4.7%
H 993
 
4.4%
Other values (66) 7885
35.1%
Nonspacing Mark
ValueCountFrequency (%)
15
 
8.2%
15
 
8.2%
14
 
7.6%
11
 
6.0%
ి 10
 
5.4%
10
 
5.4%
10
 
5.4%
9
 
4.9%
8
 
4.3%
7
 
3.8%
Other values (28) 75
40.8%
Other Punctuation
ValueCountFrequency (%)
: 1095
43.5%
' 433
 
17.2%
. 270
 
10.7%
! 181
 
7.2%
, 162
 
6.4%
& 131
 
5.2%
59
 
2.3%
/ 40
 
1.6%
? 34
 
1.3%
25
 
1.0%
Other values (18) 89
 
3.5%
Spacing Mark
ValueCountFrequency (%)
43
37.1%
15
 
12.9%
ि 12
 
10.3%
9
 
7.8%
7
 
6.0%
ி 6
 
5.2%
ি 6
 
5.2%
5
 
4.3%
3
 
2.6%
2
 
1.7%
Other values (7) 8
 
6.9%
Decimal Number
ValueCountFrequency (%)
2 401
31.9%
3 210
16.7%
1 171
13.6%
0 149
 
11.9%
4 86
 
6.8%
5 57
 
4.5%
9 53
 
4.2%
7 44
 
3.5%
6 41
 
3.3%
8 37
 
2.9%
Other values (4) 8
 
0.6%
Math Symbol
ValueCountFrequency (%)
30
61.2%
× 6
 
12.2%
+ 4
 
8.2%
~ 4
 
8.2%
| 3
 
6.1%
1
 
2.0%
1
 
2.0%
Open Punctuation
ValueCountFrequency (%)
( 12
28.6%
[ 10
23.8%
8
19.0%
7
16.7%
3
 
7.1%
1
 
2.4%
1
 
2.4%
Close Punctuation
ValueCountFrequency (%)
) 12
28.6%
] 10
23.8%
8
19.0%
7
16.7%
3
 
7.1%
1
 
2.4%
1
 
2.4%
Dash Punctuation
ValueCountFrequency (%)
- 294
94.8%
10
 
3.2%
2
 
0.6%
2
 
0.6%
2
 
0.6%
Other Number
ValueCountFrequency (%)
½ 6
54.5%
² 2
 
18.2%
³ 2
 
18.2%
1
 
9.1%
Modifier Letter
ValueCountFrequency (%)
287
99.0%
2
 
0.7%
ʻ 1
 
0.3%
Final Punctuation
ValueCountFrequency (%)
8
80.0%
1
 
10.0%
» 1
 
10.0%
Other Symbol
ValueCountFrequency (%)
6
75.0%
° 1
 
12.5%
1
 
12.5%
Letter Number
ValueCountFrequency (%)
3
42.9%
2
28.6%
2
28.6%
Space Separator
ValueCountFrequency (%)
17714
99.9%
  20
 
0.1%
Format
ValueCountFrequency (%)
2
66.7%
1
33.3%
Currency Symbol
ValueCountFrequency (%)
¢ 2
66.7%
$ 1
33.3%
Initial Punctuation
ValueCountFrequency (%)
1
50.0%
« 1
50.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 123292
78.6%
Common 22272
 
14.2%
Han 3728
 
2.4%
Katakana 2604
 
1.7%
Hangul 1610
 
1.0%
Hiragana 1269
 
0.8%
Cyrillic 989
 
0.6%
Devanagari 337
 
0.2%
Thai 323
 
0.2%
Telugu 138
 
0.1%
Other values (7) 230
 
0.1%

Most frequent character per script

Han
ValueCountFrequency (%)
93
 
2.5%
93
 
2.5%
92
 
2.5%
50
 
1.3%
41
 
1.1%
41
 
1.1%
37
 
1.0%
33
 
0.9%
30
 
0.8%
27
 
0.7%
Other values (1040) 3191
85.6%
Hangul
ValueCountFrequency (%)
61
 
3.8%
34
 
2.1%
28
 
1.7%
27
 
1.7%
26
 
1.6%
25
 
1.6%
24
 
1.5%
24
 
1.5%
22
 
1.4%
21
 
1.3%
Other values (383) 1318
81.9%
Latin
ValueCountFrequency (%)
e 14622
 
11.9%
a 9276
 
7.5%
o 8352
 
6.8%
n 7768
 
6.3%
r 7701
 
6.2%
i 7462
 
6.1%
t 6900
 
5.6%
s 5742
 
4.7%
l 4893
 
4.0%
h 4891
 
4.0%
Other values (113) 45685
37.1%
Common
ValueCountFrequency (%)
17714
79.5%
: 1095
 
4.9%
' 433
 
1.9%
2 401
 
1.8%
- 294
 
1.3%
287
 
1.3%
. 270
 
1.2%
3 210
 
0.9%
! 181
 
0.8%
1 171
 
0.8%
Other values (75) 1216
 
5.5%
Katakana
ValueCountFrequency (%)
258
 
9.9%
140
 
5.4%
118
 
4.5%
114
 
4.4%
111
 
4.3%
97
 
3.7%
86
 
3.3%
75
 
2.9%
68
 
2.6%
66
 
2.5%
Other values (69) 1471
56.5%
Hiragana
ValueCountFrequency (%)
277
21.8%
61
 
4.8%
49
 
3.9%
47
 
3.7%
43
 
3.4%
40
 
3.2%
39
 
3.1%
36
 
2.8%
35
 
2.8%
34
 
2.7%
Other values (58) 608
47.9%
Cyrillic
ValueCountFrequency (%)
а 94
 
9.5%
о 85
 
8.6%
е 73
 
7.4%
и 68
 
6.9%
р 67
 
6.8%
н 61
 
6.2%
т 42
 
4.2%
к 41
 
4.1%
л 37
 
3.7%
с 32
 
3.2%
Other values (46) 389
39.3%
Devanagari
ValueCountFrequency (%)
43
 
12.8%
18
 
5.3%
17
 
5.0%
16
 
4.7%
15
 
4.5%
15
 
4.5%
14
 
4.2%
13
 
3.9%
13
 
3.9%
ि 12
 
3.6%
Other values (38) 161
47.8%
Thai
ValueCountFrequency (%)
25
 
7.7%
23
 
7.1%
21
 
6.5%
17
 
5.3%
15
 
4.6%
14
 
4.3%
14
 
4.3%
13
 
4.0%
13
 
4.0%
11
 
3.4%
Other values (38) 157
48.6%
Telugu
ValueCountFrequency (%)
15
 
10.9%
12
 
8.7%
ి 10
 
7.2%
9
 
6.5%
8
 
5.8%
8
 
5.8%
7
 
5.1%
6
 
4.3%
5
 
3.6%
5
 
3.6%
Other values (26) 53
38.4%
Arabic
ValueCountFrequency (%)
ا 12
15.2%
ر 7
 
8.9%
م 6
 
7.6%
ل 6
 
7.6%
س 5
 
6.3%
ف 4
 
5.1%
ب 4
 
5.1%
و 3
 
3.8%
ن 3
 
3.8%
ه 3
 
3.8%
Other values (16) 26
32.9%
Greek
ValueCountFrequency (%)
α 4
 
10.3%
ν 3
 
7.7%
μ 3
 
7.7%
ο 3
 
7.7%
ς 3
 
7.7%
υ 2
 
5.1%
ρ 2
 
5.1%
η 2
 
5.1%
έ 2
 
5.1%
λ 2
 
5.1%
Other values (13) 13
33.3%
Bengali
ValueCountFrequency (%)
ি 6
17.1%
3
 
8.6%
3
 
8.6%
3
 
8.6%
3
 
8.6%
2
 
5.7%
2
 
5.7%
1
 
2.9%
1
 
2.9%
1
 
2.9%
Other values (10) 10
28.6%
Tamil
ValueCountFrequency (%)
6
16.2%
ி 6
16.2%
4
10.8%
3
8.1%
3
8.1%
2
 
5.4%
2
 
5.4%
2
 
5.4%
1
 
2.7%
1
 
2.7%
Other values (7) 7
18.9%
Kannada
ValueCountFrequency (%)
3
14.3%
2
9.5%
2
9.5%
ಿ 2
9.5%
2
9.5%
2
9.5%
2
9.5%
2
9.5%
1
 
4.8%
1
 
4.8%
Other values (2) 2
9.5%
Malayalam
ValueCountFrequency (%)
2
18.2%
ി 2
18.2%
2
18.2%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
Inherited
ValueCountFrequency (%)
5
62.5%
̀ 2
 
25.0%
1
 
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 144563
92.2%
CJK 3726
 
2.4%
Katakana 2950
 
1.9%
Hangul 1610
 
1.0%
Hiragana 1274
 
0.8%
Cyrillic 989
 
0.6%
None 658
 
0.4%
Devanagari 337
 
0.2%
Thai 323
 
0.2%
Telugu 138
 
0.1%
Other values (14) 224
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
17714
 
12.3%
e 14622
 
10.1%
a 9276
 
6.4%
o 8352
 
5.8%
n 7768
 
5.4%
r 7701
 
5.3%
i 7462
 
5.2%
t 6900
 
4.8%
s 5742
 
4.0%
l 4893
 
3.4%
Other values (77) 54133
37.4%
Katakana
ValueCountFrequency (%)
287
 
9.7%
258
 
8.7%
140
 
4.7%
118
 
4.0%
114
 
3.9%
111
 
3.8%
97
 
3.3%
86
 
2.9%
75
 
2.5%
68
 
2.3%
Other values (71) 1596
54.1%
Hiragana
ValueCountFrequency (%)
277
21.7%
61
 
4.8%
49
 
3.8%
47
 
3.7%
43
 
3.4%
40
 
3.1%
39
 
3.1%
36
 
2.8%
35
 
2.7%
34
 
2.7%
Other values (59) 613
48.1%
None
ValueCountFrequency (%)
é 103
 
15.7%
ó 36
 
5.5%
è 35
 
5.3%
30
 
4.6%
í 25
 
3.8%
25
 
3.8%
  20
 
3.0%
á 20
 
3.0%
20
 
3.0%
à 19
 
2.9%
Other values (115) 325
49.4%
Cyrillic
ValueCountFrequency (%)
а 94
 
9.5%
о 85
 
8.6%
е 73
 
7.4%
и 68
 
6.9%
р 67
 
6.8%
н 61
 
6.2%
т 42
 
4.2%
к 41
 
4.1%
л 37
 
3.7%
с 32
 
3.2%
Other values (46) 389
39.3%
CJK
ValueCountFrequency (%)
93
 
2.5%
93
 
2.5%
92
 
2.5%
50
 
1.3%
41
 
1.1%
41
 
1.1%
37
 
1.0%
33
 
0.9%
30
 
0.8%
27
 
0.7%
Other values (1039) 3189
85.6%
Hangul
ValueCountFrequency (%)
61
 
3.8%
34
 
2.1%
28
 
1.7%
27
 
1.7%
26
 
1.6%
25
 
1.6%
24
 
1.5%
24
 
1.5%
22
 
1.4%
21
 
1.3%
Other values (383) 1318
81.9%
Devanagari
ValueCountFrequency (%)
43
 
12.8%
18
 
5.3%
17
 
5.0%
16
 
4.7%
15
 
4.5%
15
 
4.5%
14
 
4.2%
13
 
3.9%
13
 
3.9%
ि 12
 
3.6%
Other values (38) 161
47.8%
Thai
ValueCountFrequency (%)
25
 
7.7%
23
 
7.1%
21
 
6.5%
17
 
5.3%
15
 
4.6%
14
 
4.3%
14
 
4.3%
13
 
4.0%
13
 
4.0%
11
 
3.4%
Other values (38) 157
48.6%
Telugu
ValueCountFrequency (%)
15
 
10.9%
12
 
8.7%
ి 10
 
7.2%
9
 
6.5%
8
 
5.8%
8
 
5.8%
7
 
5.1%
6
 
4.3%
5
 
3.6%
5
 
3.6%
Other values (26) 53
38.4%
Arabic
ValueCountFrequency (%)
ا 12
15.2%
ر 7
 
8.9%
م 6
 
7.6%
ل 6
 
7.6%
س 5
 
6.3%
ف 4
 
5.1%
ب 4
 
5.1%
و 3
 
3.8%
ن 3
 
3.8%
ه 3
 
3.8%
Other values (16) 26
32.9%
Punctuation
ValueCountFrequency (%)
8
42.1%
2
 
10.5%
2
 
10.5%
2
 
10.5%
1
 
5.3%
1
 
5.3%
1
 
5.3%
1
 
5.3%
1
 
5.3%
Tamil
ValueCountFrequency (%)
6
16.2%
ி 6
16.2%
4
10.8%
3
8.1%
3
8.1%
2
 
5.4%
2
 
5.4%
2
 
5.4%
1
 
2.7%
1
 
2.7%
Other values (7) 7
18.9%
Misc Symbols
ValueCountFrequency (%)
6
100.0%
Bengali
ValueCountFrequency (%)
ি 6
17.1%
3
 
8.6%
3
 
8.6%
3
 
8.6%
3
 
8.6%
2
 
5.7%
2
 
5.7%
1
 
2.9%
1
 
2.9%
1
 
2.9%
Other values (10) 10
28.6%
Kannada
ValueCountFrequency (%)
3
14.3%
2
9.5%
2
9.5%
ಿ 2
9.5%
2
9.5%
2
9.5%
2
9.5%
2
9.5%
1
 
4.8%
1
 
4.8%
Other values (2) 2
9.5%
Number Forms
ValueCountFrequency (%)
3
37.5%
2
25.0%
2
25.0%
1
 
12.5%
Malayalam
ValueCountFrequency (%)
2
18.2%
ി 2
18.2%
2
18.2%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
Diacriticals
ValueCountFrequency (%)
̀ 2
100.0%
Latin Ext Additional
ValueCountFrequency (%)
2
100.0%
CJK Compat Forms
ValueCountFrequency (%)
1
100.0%
Modifier Letters
ValueCountFrequency (%)
ʻ 1
100.0%
Math Operators
ValueCountFrequency (%)
1
100.0%
Geometric Shapes
ValueCountFrequency (%)
1
100.0%
Distinct: 9946
Distinct (%): > 99.9%
Missing: 51
Missing (%): 0.5%
Memory size: 156.2 KiB

Length

Max length: 1000
Median length: 648
Mean length: 276.2556
Min length: 11

Characters and Unicode

Total characters: 2748467
Distinct characters: 144
Distinct categories: 19
Distinct scripts: 3
Distinct blocks: 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9943
Unique (%): 99.9%

Sample

1st row: Armed with every weapon they can get their hands on and the skills to use them, The Expendables are the world’s last line of defense and the team that gets called when all other options are off the table. But new team members with new styles and tactics are going to give “new blood” a whole new meaning.
2nd row: Robert McCall finds himself at home in Southern Italy but he discovers his friends are under the control of local crime bosses. As events turn deadly, McCall knows what he has to do: become his friends' protector by taking on the mafia.
3rd row: In 1980s Hollywood, action star Johnny Cage is looking to become an A-list actor. But when his costar, Jennifer, goes missing from set, Johnny finds himself thrust into a world filled with shadows, danger, and deceit. As he embarks on a bloody journey, Johnny quickly discovers the City of Angels has more than a few devils in its midst.
4th row: Ethan Hunt and his IMF team embark on their most dangerous mission yet: To track down a terrifying new weapon that threatens all of humanity before it falls into the wrong hands. With control of the future and the world's fate at stake and dark forces from Ethan's past closing in, a deadly race around the globe begins. Confronted by a mysterious, all-powerful enemy, Ethan must consider that nothing can matter more than his mission—not even the lives of those he cares about most.
5th row: A young pregnant woman named Mia escapes from a country at war by hiding in a maritime container aboard a cargo ship. After a violent storm, Mia gives birth to the child while lost at sea, where she must fight to survive.
Value                 Count    Frequency (%)
the                   25823    5.5%
a                     20517    4.4%
to                    15672    3.3%
and                   13764    2.9%
of                    12638    2.7%
in                    8235     1.8%
his                   7119     1.5%
is                    6148     1.3%
her                   4789     1.0%
with                  4711     1.0%
Other values (33420)  349955   74.6%

Most occurring characters

ValueCountFrequency (%)
459804
16.7%
e 268366
 
9.8%
t 182946
 
6.7%
a 179364
 
6.5%
i 159531
 
5.8%
n 159509
 
5.8%
o 157954
 
5.7%
s 147563
 
5.4%
r 145146
 
5.3%
h 116848
 
4.3%
Other values (134) 771436
28.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2147996
78.2%
Space Separator 459827
 
16.7%
Uppercase Letter 67665
 
2.5%
Other Punctuation 56668
 
2.1%
Dash Punctuation 8359
 
0.3%
Decimal Number 5892
 
0.2%
Final Punctuation 1175
 
< 0.1%
Open Punctuation 312
 
< 0.1%
Close Punctuation 312
 
< 0.1%
Initial Punctuation 158
 
< 0.1%
Other values (9) 103
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 268366
12.5%
t 182946
 
8.5%
a 179364
 
8.4%
i 159531
 
7.4%
n 159509
 
7.4%
o 157954
 
7.4%
s 147563
 
6.9%
r 145146
 
6.8%
h 116848
 
5.4%
l 90119
 
4.2%
Other values (47) 540650
25.2%
Uppercase Letter
ValueCountFrequency (%)
A 8120
 
12.0%
T 5570
 
8.2%
S 5511
 
8.1%
B 4252
 
6.3%
C 3995
 
5.9%
M 3993
 
5.9%
W 3803
 
5.6%
H 3300
 
4.9%
D 2843
 
4.2%
I 2735
 
4.0%
Other values (21) 23543
34.8%
Other Punctuation
ValueCountFrequency (%)
, 25528
45.0%
. 22381
39.5%
' 5636
 
9.9%
" 1356
 
2.4%
: 582
 
1.0%
? 407
 
0.7%
! 350
 
0.6%
; 176
 
0.3%
112
 
0.2%
/ 66
 
0.1%
Other values (5) 74
 
0.1%
Decimal Number
ValueCountFrequency (%)
1 1394
23.7%
0 1194
20.3%
9 796
13.5%
2 623
10.6%
5 347
 
5.9%
3 332
 
5.6%
8 331
 
5.6%
7 310
 
5.3%
6 293
 
5.0%
4 272
 
4.6%
Dash Punctuation
ValueCountFrequency (%)
- 7471
89.4%
594
 
7.1%
294
 
3.5%
Final Punctuation
ValueCountFrequency (%)
1052
89.5%
121
 
10.3%
» 2
 
0.2%
Initial Punctuation
ValueCountFrequency (%)
120
75.9%
36
 
22.8%
« 2
 
1.3%
Format
ValueCountFrequency (%)
2
50.0%
1
25.0%
1
25.0%
Space Separator
ValueCountFrequency (%)
459804
> 99.9%
  23
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 308
98.7%
[ 4
 
1.3%
Close Punctuation
ValueCountFrequency (%)
) 308
98.7%
] 4
 
1.3%
Currency Symbol
ValueCountFrequency (%)
$ 57
98.3%
£ 1
 
1.7%
Control
ValueCountFrequency (%)
14
93.3%
1
 
6.7%
Other Symbol
ValueCountFrequency (%)
10
83.3%
® 2
 
16.7%
Nonspacing Mark
ValueCountFrequency (%)
́ 2
66.7%
̈ 1
33.3%
Other Number
ValueCountFrequency (%)
¹ 1
50.0%
² 1
50.0%
Math Symbol
ValueCountFrequency (%)
+ 4
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 3
100.0%
Modifier Symbol
ValueCountFrequency (%)
´ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2215661
80.6%
Common 532803
 
19.4%
Inherited 3
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 268366
12.1%
t 182946
 
8.3%
a 179364
 
8.1%
i 159531
 
7.2%
n 159509
 
7.2%
o 157954
 
7.1%
s 147563
 
6.7%
r 145146
 
6.6%
h 116848
 
5.3%
l 90119
 
4.1%
Other values (78) 608315
27.5%
Common
ValueCountFrequency (%)
459804
86.3%
, 25528
 
4.8%
. 22381
 
4.2%
- 7471
 
1.4%
' 5636
 
1.1%
1 1394
 
0.3%
" 1356
 
0.3%
0 1194
 
0.2%
1052
 
0.2%
9 796
 
0.1%
Other values (44) 6191
 
1.2%
Inherited
ValueCountFrequency (%)
́ 2
66.7%
̈ 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2745676
99.9%
Punctuation 2336
 
0.1%
None 438
 
< 0.1%
Letterlike Symbols 10
 
< 0.1%
Modifier Letters 3
 
< 0.1%
Diacriticals 3
 
< 0.1%
Alphabetic PF 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
459804
16.7%
e 268366
 
9.8%
t 182946
 
6.7%
a 179364
 
6.5%
i 159531
 
5.8%
n 159509
 
5.8%
o 157954
 
5.8%
s 147563
 
5.4%
r 145146
 
5.3%
h 116848
 
4.3%
Other values (75) 768645
28.0%
Punctuation
ValueCountFrequency (%)
1052
45.0%
594
25.4%
294
 
12.6%
121
 
5.2%
120
 
5.1%
112
 
4.8%
36
 
1.5%
3
 
0.1%
2
 
0.1%
1
 
< 0.1%
None
ValueCountFrequency (%)
é 204
46.6%
ō 25
 
5.7%
í 25
 
5.7%
  23
 
5.3%
á 22
 
5.0%
è 16
 
3.7%
ū 11
 
2.5%
ï 11
 
2.5%
ä 9
 
2.1%
ç 9
 
2.1%
Other values (33) 83
18.9%
Letterlike Symbols
ValueCountFrequency (%)
10
100.0%
Modifier Letters
ValueCountFrequency (%)
ʻ 3
100.0%
Diacriticals
ValueCountFrequency (%)
́ 2
66.7%
̈ 1
33.3%
Alphabetic PF
ValueCountFrequency (%)
1
100.0%

Popularity
Real number (ℝ)

SKEWED 

Distinct: 8069
Distinct (%): 80.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 34.248141
Minimum: 13.049
Maximum: 3741.062
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 156.2 KiB

Quantile statistics

Minimum: 13.049
5-th percentile: 13.47095
Q1: 15.59075
Median: 20.096
Q3: 30.3105
95-th percentile: 79.9818
Maximum: 3741.062
Range: 3728.013
Interquartile range (IQR): 14.71975

Descriptive statistics

Standard deviation: 84.332838
Coefficient of variation (CV): 2.4624063
Kurtosis: 603.66538
Mean: 34.248141
Median Absolute Deviation (MAD): 5.4355
Skewness: 20.169371
Sum: 342481.41
Variance: 7112.0276
Monotonicity: Decreasing
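
The large positive skewness reflects a long right tail: the maximum (3741.062) is far above the median (20.096). The sketch below shows one common skewness estimator, the adjusted Fisher-Pearson coefficient. That the report uses exactly this estimator is an assumption, `sample_skewness` is a hypothetical helper, and the toy inputs are not taken from the dataset.

```python
import math


def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness: sqrt(n(n-1)) / (n-2) * m3 / m2**1.5."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three values")
    mean = sum(values) / n
    # Second and third central moments (population form).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    if m2 == 0:
        return 0.0  # constant data has no defined skew; report zero
    g1 = m3 / m2 ** 1.5
    return math.sqrt(n * (n - 1)) / (n - 2) * g1


symmetric = [1, 2, 3, 4, 5]        # toy data, symmetric around 3
right_tailed = [1, 1, 1, 1, 50]    # toy data with one large outlier

print(sample_skewness(symmetric))
print(sample_skewness(right_tailed))
```

Symmetric data gives a skewness of zero, while a single large outlier pushes it positive, which is the pattern the Skewed alert for Popularity points to.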
Histogram with fixed size bins (bins=50)
Common values
Value                Count   Frequency (%)
13.297               6       0.1%
15.499               6       0.1%
16.146               6       0.1%
13.112               5       0.1%
13.572               5       0.1%
13.84                5       0.1%
14.739               5       0.1%
13.67                5       0.1%
14.76                5       0.1%
13.287               5       0.1%
Other values (8059)  9947    99.5%

Minimum 10 values
Value                Count   Frequency (%)
13.049               3       < 0.1%
13.05                1       < 0.1%
13.051               3       < 0.1%
13.052               2       < 0.1%
13.054               3       < 0.1%
13.055               2       < 0.1%
13.056               1       < 0.1%
13.057               2       < 0.1%
13.059               1       < 0.1%
13.063               2       < 0.1%

Maximum 10 values
Value                Count   Frequency (%)
3741.062             1       < 0.1%
2471.515             1       < 0.1%
2223.43              1       < 0.1%
2032.927             1       < 0.1%
1627.678             1       < 0.1%
1594.559             1       < 0.1%
1521.075             1       < 0.1%
1469.177             1       < 0.1%
1315.518             1       < 0.1%
1304.978             1       < 0.1%

Distinct: 5947
Distinct (%): 59.6%
Missing: 21
Missing (%): 0.2%
Memory size: 156.2 KiB
Minimum: 1902-04-17 00:00:00
Maximum: 2027-05-05 00:00:00
Histogram with fixed size bins (bins=50)

Title
Text

Distinct: 9637
Distinct (%): 96.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 104
Median length: 72
Mean length: 17.0992
Min length: 1

Characters and Unicode

Total characters: 170992
Distinct characters: 155
Distinct categories: 18
Distinct scripts: 5
Distinct blocks: 9
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9312
Unique (%): 93.1%

Sample

1st row: Expend4bles
2nd row: The Equalizer 3
3rd row: Mortal Kombat Legends: Cage Match
4th row: Mission: Impossible - Dead Reckoning Part One
5th row: Nowhere
ValueCountFrequency (%)
the 3405
 
11.2%
of 1037
 
3.4%
a 413
 
1.4%
and 338
 
1.1%
2 300
 
1.0%
in 299
 
1.0%
212
 
0.7%
to 210
 
0.7%
movie 167
 
0.5%
3 122
 
0.4%
Other values (7795) 23908
78.6%

Most occurring characters

ValueCountFrequency (%)
20411
 
11.9%
e 17408
 
10.2%
a 10643
 
6.2%
o 10266
 
6.0%
n 9243
 
5.4%
r 9058
 
5.3%
i 8877
 
5.2%
t 8378
 
4.9%
s 6611
 
3.9%
h 6310
 
3.7%
Other values (145) 63787
37.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 119406
69.8%
Uppercase Letter 26585
 
15.5%
Space Separator 20413
 
11.9%
Other Punctuation 2857
 
1.7%
Decimal Number 1286
 
0.8%
Dash Punctuation 341
 
0.2%
Other Letter 32
 
< 0.1%
Open Punctuation 21
 
< 0.1%
Close Punctuation 21
 
< 0.1%
Other Number 11
 
< 0.1%
Other values (8) 19
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 17408
14.6%
a 10643
 
8.9%
o 10266
 
8.6%
n 9243
 
7.7%
r 9058
 
7.6%
i 8877
 
7.4%
t 8378
 
7.0%
s 6611
 
5.5%
h 6310
 
5.3%
l 5621
 
4.7%
Other values (31) 26991
22.6%
Other Letter
ValueCountFrequency (%)
2
 
6.2%
1
 
3.1%
1
 
3.1%
1
 
3.1%
1
 
3.1%
1
 
3.1%
1
 
3.1%
1
 
3.1%
1
 
3.1%
1
 
3.1%
Other values (21) 21
65.6%
Uppercase Letter
ValueCountFrequency (%)
T 3506
 
13.2%
S 2362
 
8.9%
M 1816
 
6.8%
B 1705
 
6.4%
D 1521
 
5.7%
A 1503
 
5.7%
C 1489
 
5.6%
P 1252
 
4.7%
L 1219
 
4.6%
H 1125
 
4.2%
Other values (20) 9087
34.2%
Other Punctuation
ValueCountFrequency (%)
: 1476
51.7%
' 492
 
17.2%
. 282
 
9.9%
, 192
 
6.7%
! 169
 
5.9%
& 141
 
4.9%
/ 39
 
1.4%
? 36
 
1.3%
* 8
 
0.3%
" 6
 
0.2%
Other values (9) 16
 
0.6%
Decimal Number
ValueCountFrequency (%)
2 408
31.7%
3 217
16.9%
1 178
13.8%
0 162
 
12.6%
4 90
 
7.0%
5 56
 
4.4%
9 56
 
4.4%
7 44
 
3.4%
6 39
 
3.0%
8 36
 
2.8%
Other Number
ValueCountFrequency (%)
½ 6
54.5%
³ 2
 
18.2%
1
 
9.1%
² 1
 
9.1%
1
 
9.1%
Dash Punctuation
ValueCountFrequency (%)
- 326
95.6%
11
 
3.2%
4
 
1.2%
Space Separator
ValueCountFrequency (%)
20411
> 99.9%
2
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 16
76.2%
[ 5
 
23.8%
Close Punctuation
ValueCountFrequency (%)
) 16
76.2%
] 5
 
23.8%
Math Symbol
ValueCountFrequency (%)
+ 4
80.0%
| 1
 
20.0%
Currency Symbol
ValueCountFrequency (%)
¢ 2
66.7%
$ 1
33.3%
Final Punctuation
ValueCountFrequency (%)
4
100.0%
Nonspacing Mark
ValueCountFrequency (%)
̀ 2
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%
Format
ValueCountFrequency (%)
­ 1
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 1
100.0%
Modifier Symbol
ValueCountFrequency (%)
´ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 145991
85.4%
Common 24967
 
14.6%
Han 26
 
< 0.1%
Hiragana 6
 
< 0.1%
Inherited 2
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 17408
 
11.9%
a 10643
 
7.3%
o 10266
 
7.0%
n 9243
 
6.3%
r 9058
 
6.2%
i 8877
 
6.1%
t 8378
 
5.7%
s 6611
 
4.5%
h 6310
 
4.3%
l 5621
 
3.9%
Other values (61) 53576
36.7%
Common
ValueCountFrequency (%)
20411
81.8%
: 1476
 
5.9%
' 492
 
2.0%
2 408
 
1.6%
- 326
 
1.3%
. 282
 
1.1%
3 217
 
0.9%
, 192
 
0.8%
1 178
 
0.7%
! 169
 
0.7%
Other values (42) 816
 
3.3%
Han
ValueCountFrequency (%)
1
 
3.8%
1
 
3.8%
1
 
3.8%
1
 
3.8%
1
 
3.8%
1
 
3.8%
1
 
3.8%
1
 
3.8%
1
 
3.8%
1
 
3.8%
Other values (16) 16
61.5%
Hiragana
ValueCountFrequency (%)
2
33.3%
1
16.7%
1
16.7%
1
16.7%
1
16.7%
Inherited
ValueCountFrequency (%)
̀ 2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 170815
99.9%
None 117
 
0.1%
CJK 26
 
< 0.1%
Punctuation 22
 
< 0.1%
Hiragana 6
 
< 0.1%
Diacriticals 2
 
< 0.1%
Latin Ext Additional 2
 
< 0.1%
Modifier Letters 1
 
< 0.1%
Number Forms 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
20411
 
11.9%
e 17408
 
10.2%
a 10643
 
6.2%
o 10266
 
6.0%
n 9243
 
5.4%
r 9058
 
5.3%
i 8877
 
5.2%
t 8378
 
4.9%
s 6611
 
3.9%
h 6310
 
3.7%
Other values (76) 63610
37.2%
None
ValueCountFrequency (%)
é 49
41.9%
ó 8
 
6.8%
í 6
 
5.1%
½ 6
 
5.1%
á 5
 
4.3%
à 4
 
3.4%
¡ 4
 
3.4%
ā 3
 
2.6%
ñ 3
 
2.6%
¿ 3
 
2.6%
Other values (19) 26
22.2%
Punctuation
ValueCountFrequency (%)
11
50.0%
4
 
18.2%
4
 
18.2%
2
 
9.1%
1
 
4.5%
Diacriticals
ValueCountFrequency (%)
̀ 2
100.0%
Hiragana
ValueCountFrequency (%)
2
33.3%
1
16.7%
1
16.7%
1
16.7%
1
16.7%
Latin Ext Additional
ValueCountFrequency (%)
2
100.0%
CJK
ValueCountFrequency (%)
1
 
3.8%
1
 
3.8%
1
 
3.8%
1
 
3.8%
1
 
3.8%
1
 
3.8%
1
 
3.8%
1
 
3.8%
1
 
3.8%
1
 
3.8%
Other values (16) 16
61.5%
Modifier Letters
ValueCountFrequency (%)
ʻ 1
100.0%
Number Forms
ValueCountFrequency (%)
1
100.0%

VoteAverage
Real number (ℝ)

ZEROS 

Distinct: 73
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.36797
Minimum: 0
Maximum: 10
Zeros: 261
Zeros (%): 2.6%
Negative: 0
Negative (%): 0.0%
Memory size: 156.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 4.5
Q1: 6
Median: 6.6
Q3: 7.1
95-th percentile: 7.9
Maximum: 10
Range: 10
Interquartile range (IQR): 1.1

Descriptive statistics

Standard deviation: 1.3807251
Coefficient of variation (CV): 0.21682343
Kurtosis: 9.7538572
Mean: 6.36797
Median Absolute Deviation (MAD): 0.6
Skewness: -2.5954374
Sum: 63679.7
Variance: 1.9064017
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value              Count   Frequency (%)
6.5                466     4.7%
6.8                448     4.5%
7                  447     4.5%
6.3                438     4.4%
6.6                437     4.4%
6.7                430     4.3%
6.9                423     4.2%
6.4                415     4.2%
6.2                410     4.1%
6.1                389     3.9%
Other values (63)  5697    57.0%

Minimum 10 values
Value              Count   Frequency (%)
0                  261     2.6%
1                  10      0.1%
1.5                1       < 0.1%
2                  7       0.1%
2.3                1       < 0.1%
2.5                1       < 0.1%
2.7                2       < 0.1%
2.8                1       < 0.1%
2.9                3       < 0.1%
3                  10      0.1%

Maximum 10 values
Value              Count   Frequency (%)
10                 10      0.1%
9.8                1       < 0.1%
9.5                2       < 0.1%
9.3                1       < 0.1%
9                  8       0.1%
8.9                1       < 0.1%
8.8                5       0.1%
8.7                2       < 0.1%
8.6                6       0.1%
8.5                21      0.2%

VoteCount
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct 3543
Distinct (%) 35.4%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 1620.1538
Minimum 0
Maximum 34628
Zeros 260
Zeros (%) 2.6%
Negative 0
Negative (%) 0.0%
Memory size 156.2 KiB

Quantile statistics

Minimum 0
5-th percentile 5
Q1 168
Median 563
Q3 1665
95-th percentile 6937.25
Maximum 34628
Range 34628
Interquartile range (IQR) 1497

Descriptive statistics

Standard deviation 2960.6426
Coefficient of variation (CV) 1.8273837
Kurtosis 21.895306
Mean 1620.1538
Median Absolute Deviation (MAD) 488
Skewness 4.0440426
Sum 16201538
Variance 8765404.7
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 260
 
2.6%
1 72
 
0.7%
2 60
 
0.6%
3 49
 
0.5%
5 46
 
0.5%
4 45
 
0.4%
8 39
 
0.4%
7 36
 
0.4%
6 33
 
0.3%
10 30
 
0.3%
Other values (3533) 9330
93.3%
ValueCountFrequency (%)
0 260
2.6%
1 72
 
0.7%
2 60
 
0.6%
3 49
 
0.5%
4 45
 
0.4%
5 46
 
0.5%
6 33
 
0.3%
7 36
 
0.4%
8 39
 
0.4%
9 27
 
0.3%
ValueCountFrequency (%)
34628 1
< 0.1%
32726 1
< 0.1%
30768 1
< 0.1%
29904 1
< 0.1%
29241 1
< 0.1%
28971 1
< 0.1%
27827 1
< 0.1%
27366 1
< 0.1%
26721 1
< 0.1%
26016 1
< 0.1%

Budget
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct 710
Distinct (%) 7.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 20439089
Minimum 0
Maximum 4.6 × 10^8
Zeros 4472
Zeros (%) 44.7%
Negative 0
Negative (%) 0.0%
Memory size 156.2 KiB

Quantile statistics

Minimum 0
5-th percentile 0
Q1 0
Median 2130000
Q3 25000000
95-th percentile 1 × 10^8
Maximum 4.6 × 10^8
Range 4.6 × 10^8
Interquartile range (IQR) 25000000

Descriptive statistics

Standard deviation 38786607
Coefficient of variation (CV) 1.8976681
Kurtosis 13.84859
Mean 20439089
Median Absolute Deviation (MAD) 2130000
Skewness 3.2401366
Sum 2.0439089 × 10^11
Variance 1.5044009 × 10^15
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 4472
44.7%
20000000 210
 
2.1%
30000000 184
 
1.8%
25000000 173
 
1.7%
10000000 169
 
1.7%
15000000 165
 
1.7%
40000000 157
 
1.6%
5000000 145
 
1.5%
50000000 133
 
1.3%
35000000 132
 
1.3%
Other values (700) 4060
40.6%
ValueCountFrequency (%)
0 4472
44.7%
1 3
 
< 0.1%
4 1
 
< 0.1%
5 1
 
< 0.1%
6 1
 
< 0.1%
7 1
 
< 0.1%
10 1
 
< 0.1%
15 1
 
< 0.1%
20 1
 
< 0.1%
26 1
 
< 0.1%
ValueCountFrequency (%)
460000000 1
 
< 0.1%
379000000 1
 
< 0.1%
365000000 1
 
< 0.1%
356000000 1
 
< 0.1%
340000000 1
 
< 0.1%
300000000 4
< 0.1%
297000000 1
 
< 0.1%
294700000 1
 
< 0.1%
291000000 1
 
< 0.1%
274800000 1
 
< 0.1%

ProductionCompanies
Text

Distinct 8180
Distinct (%) 81.8%
Missing 0
Missing (%) 0.0%
Memory size 156.2 KiB

Length

Max length 2591
Median length 1060
Mean length 324.7863
Min length 2

Characters and Unicode

Total characters 3247863
Distinct characters 205
Distinct categories 16
Distinct scripts 6
Distinct blocks 8
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
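
As an illustration of the character properties mentioned above, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for one company name taken from this column's sample rows; the helper name `category_counts` is hypothetical, and this is not necessarily how the profiling tool computes its tables.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count characters in `text` by Unicode general category code (e.g. 'Ll', 'Po')."""
    return Counter(unicodedata.category(ch) for ch in text)

# 'Ll' = Lowercase Letter, 'Lu' = Uppercase Letter,
# 'Zs' = Space Separator, 'Po' = Other Punctuation.
counts = category_counts("Rock & Ruz")
print(counts.most_common())
```

The standard library reports two-letter category codes, whereas the report spells out the category names. Scripts and blocks are not available from `unicodedata` and would need a separate lookup.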

Unique

Unique 7740
Unique (%) 77.4%

Sample

1st row[{'id': 1020, 'logo_path': '/kuUIHNwMec4dwOLghDhhZJzHZTd.png', 'name': 'Millennium Media', 'origin_country': 'US'}, {'id': 48738, 'logo_path': None, 'name': 'Campbell Grobman Films', 'origin_country': 'US'}, {'id': 1632, 'logo_path': '/cisLn1YAUuptXVBa0xjq7ST9cH0.png', 'name': 'Lionsgate', 'origin_country': 'US'}]
2nd row[{'id': 1423, 'logo_path': '/1rbAwGQzrNvXDICD6HWEn1YqfAV.png', 'name': 'Escape Artists', 'origin_country': 'US'}, {'id': 5, 'logo_path': '/wrweLpBqRYcAM7kCSaHDJRxKGOP.png', 'name': 'Columbia Pictures', 'origin_country': 'US'}, {'id': 10400, 'logo_path': '/9LlB2YAwXTkUAhx0pItSo6pDlkB.png', 'name': 'Eagle Pictures', 'origin_country': 'IT'}, {'id': 44967, 'logo_path': None, 'name': 'ZHIV Productions', 'origin_country': ''}]
3rd row[{'id': 2785, 'logo_path': '/l5zW8jjmQOCx2dFmvnmbYmqoBmL.png', 'name': 'Warner Bros. Animation', 'origin_country': 'US'}]
4th row[{'id': 4, 'logo_path': '/gz66EfNoYPqHTYI4q9UEN4CbHRc.png', 'name': 'Paramount', 'origin_country': 'US'}, {'id': 82819, 'logo_path': '/gXfFl9pRPaoaq14jybEn1pHeldr.png', 'name': 'Skydance', 'origin_country': 'US'}, {'id': 21777, 'logo_path': None, 'name': 'TC Productions', 'origin_country': 'US'}]
5th row[{'id': 204005, 'logo_path': None, 'name': 'Rock & Ruz', 'origin_country': 'ES'}]
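
The sample rows above are stored as text in Python literal syntax (single quotes, `None`), not as JSON, which is why the column is profiled as Text. A minimal sketch of turning one cell back into structured data, assuming every cell follows the same list-of-dicts shape as the samples; the helper name `company_names` is hypothetical.

```python
import ast

# 5th sample row of the column above, stored as a Python-literal string.
raw = "[{'id': 204005, 'logo_path': None, 'name': 'Rock & Ruz', 'origin_country': 'ES'}]"

def company_names(cell):
    """Parse one stringified list of dicts and return the company names."""
    # ast.literal_eval only evaluates literals, so it is safe on untrusted text;
    # json.loads would reject the single quotes and the bare None.
    companies = ast.literal_eval(cell)
    return [company["name"] for company in companies]

names = company_names(raw)
print(names)
```

A cell of length 2 (the reported minimum length) would be the empty list `[]`, for which the helper returns an empty list of names.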
ValueCountFrequency (%)
id 31298
 
10.7%
logo_path 31288
 
10.7%
name 31288
 
10.7%
origin_country 31288
 
10.7%
us 14093
 
4.8%
none 12436
 
4.3%
6894
 
2.4%
pictures 4744
 
1.6%
films 3648
 
1.3%
productions 3537
 
1.2%
Other values (21950) 120954
41.5%

Most occurring characters

ValueCountFrequency (%)
' 413056
 
12.7%
281472
 
8.7%
o 178504
 
5.5%
n 171817
 
5.3%
i 143373
 
4.4%
: 125161
 
3.9%
, 115615
 
3.6%
t 107883
 
3.3%
r 105135
 
3.2%
a 102782
 
3.2%
Other values (195) 1503065
46.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1553183
47.8%
Other Punctuation 694096
21.4%
Uppercase Letter 355938
 
11.0%
Space Separator 281472
 
8.7%
Decimal Number 216171
 
6.7%
Connector Punctuation 62579
 
1.9%
Close Punctuation 41476
 
1.3%
Open Punctuation 41476
 
1.3%
Dash Punctuation 1007
 
< 0.1%
Math Symbol 297
 
< 0.1%
Other values (6) 168
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
8
 
5.6%
8
 
5.6%
8
 
5.6%
8
 
5.6%
5
 
3.5%
5
 
3.5%
5
 
3.5%
4
 
2.8%
3
 
2.1%
3
 
2.1%
Other values (71) 86
60.1%
Lowercase Letter
ValueCountFrequency (%)
o 178504
11.5%
n 171817
 
11.1%
i 143373
 
9.2%
t 107883
 
6.9%
r 105135
 
6.8%
a 102782
 
6.6%
g 93497
 
6.0%
e 93045
 
6.0%
p 63197
 
4.1%
u 58286
 
3.8%
Other values (41) 435664
28.0%
Uppercase Letter
ValueCountFrequency (%)
S 28655
 
8.1%
U 23570
 
6.6%
N 22843
 
6.4%
P 22036
 
6.2%
F 17544
 
4.9%
E 16361
 
4.6%
C 16186
 
4.5%
A 14254
 
4.0%
R 14248
 
4.0%
B 12993
 
3.7%
Other values (23) 167248
47.0%
Other Punctuation
ValueCountFrequency (%)
' 413056
59.5%
: 125161
 
18.0%
, 115615
 
16.7%
. 20518
 
3.0%
/ 19164
 
2.8%
& 333
 
< 0.1%
" 226
 
< 0.1%
! 16
 
< 0.1%
@ 4
 
< 0.1%
2
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 30685
14.2%
2 23949
11.1%
4 21343
9.9%
3 21150
9.8%
5 20894
9.7%
9 20498
9.5%
8 20052
9.3%
0 19663
9.1%
6 19096
8.8%
7 18841
8.7%
Close Punctuation
ValueCountFrequency (%)
} 31288
75.4%
] 10000
 
24.1%
) 185
 
0.4%
3
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
{ 31288
75.4%
[ 10000
 
24.1%
( 185
 
0.4%
3
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 1006
99.9%
1
 
0.1%
Other Symbol
ValueCountFrequency (%)
13
92.9%
1
 
7.1%
Space Separator
ValueCountFrequency (%)
281472
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 62579
100.0%
Math Symbol
ValueCountFrequency (%)
+ 297
100.0%
Other Number
ValueCountFrequency (%)
² 5
100.0%
Final Punctuation
ValueCountFrequency (%)
3
100.0%
Modifier Letter
ValueCountFrequency (%)
2
100.0%
Nonspacing Mark
ValueCountFrequency (%)
́ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1909121
58.8%
Common 1338597
41.2%
Han 105
 
< 0.1%
Hangul 31
 
< 0.1%
Katakana 8
 
< 0.1%
Inherited 1
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 178504
 
9.4%
n 171817
 
9.0%
i 143373
 
7.5%
t 107883
 
5.7%
r 105135
 
5.5%
a 102782
 
5.4%
g 93497
 
4.9%
e 93045
 
4.9%
p 63197
 
3.3%
u 58286
 
3.1%
Other values (74) 791602
41.5%
Han
ValueCountFrequency (%)
8
 
7.6%
8
 
7.6%
8
 
7.6%
8
 
7.6%
5
 
4.8%
5
 
4.8%
5
 
4.8%
4
 
3.8%
3
 
2.9%
3
 
2.9%
Other values (40) 48
45.7%
Common
ValueCountFrequency (%)
' 413056
30.9%
281472
21.0%
: 125161
 
9.4%
, 115615
 
8.6%
_ 62579
 
4.7%
} 31288
 
2.3%
{ 31288
 
2.3%
1 30685
 
2.3%
2 23949
 
1.8%
4 21343
 
1.6%
Other values (28) 202161
15.1%
Hangul
ValueCountFrequency (%)
3
 
9.7%
2
 
6.5%
2
 
6.5%
2
 
6.5%
2
 
6.5%
2
 
6.5%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
Other values (14) 14
45.2%
Katakana
ValueCountFrequency (%)
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
Inherited
ValueCountFrequency (%)
́ 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3246654
> 99.9%
None 1046
 
< 0.1%
CJK 105
 
< 0.1%
Hangul 30
 
< 0.1%
Letterlike Symbols 13
 
< 0.1%
Katakana 10
 
< 0.1%
Punctuation 4
 
< 0.1%
Diacriticals 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
' 413056
 
12.7%
281472
 
8.7%
o 178504
 
5.5%
n 171817
 
5.3%
i 143373
 
4.4%
: 125161
 
3.9%
, 115615
 
3.6%
t 107883
 
3.3%
r 105135
 
3.2%
a 102782
 
3.2%
Other values (72) 1501856
46.3%
None
ValueCountFrequency (%)
é 602
57.6%
í 51
 
4.9%
ä 48
 
4.6%
ñ 44
 
4.2%
É 38
 
3.6%
á 35
 
3.3%
ó 34
 
3.3%
ö 31
 
3.0%
ü 18
 
1.7%
ç 18
 
1.7%
Other values (27) 127
 
12.1%
Letterlike Symbols
ValueCountFrequency (%)
13
100.0%
CJK
ValueCountFrequency (%)
8
 
7.6%
8
 
7.6%
8
 
7.6%
8
 
7.6%
5
 
4.8%
5
 
4.8%
5
 
4.8%
4
 
3.8%
3
 
2.9%
3
 
2.9%
Other values (40) 48
45.7%
Punctuation
ValueCountFrequency (%)
3
75.0%
1
 
25.0%
Hangul
ValueCountFrequency (%)
3
 
10.0%
2
 
6.7%
2
 
6.7%
2
 
6.7%
2
 
6.7%
2
 
6.7%
1
 
3.3%
1
 
3.3%
1
 
3.3%
1
 
3.3%
Other values (13) 13
43.3%
Katakana
ValueCountFrequency (%)
2
20.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
Diacriticals
ValueCountFrequency (%)
́ 1
100.0%

ProductionCountries
Text

Distinct 868
Distinct (%) 8.7%
Missing 0
Missing (%) 0.0%
Memory size 156.2 KiB

Length

Max length 517
Median length 433
Mean length 69.0281
Min length 2

Characters and Unicode

Total characters 690281
Distinct characters 63
Distinct categories 8
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 613
Unique (%) 6.1%

Sample

1st row[{'iso_3166_1': 'US', 'name': 'United States of America'}]
2nd row[{'iso_3166_1': 'IT', 'name': 'Italy'}, {'iso_3166_1': 'US', 'name': 'United States of America'}]
3rd row[{'iso_3166_1': 'US', 'name': 'United States of America'}]
4th row[{'iso_3166_1': 'US', 'name': 'United States of America'}]
5th row[{'iso_3166_1': 'ES', 'name': 'Spain'}]
ValueCountFrequency (%)
iso_3166_1 13839
17.8%
name 13839
17.8%
united 7991
10.3%
states 6750
8.7%
of 6750
8.7%
america 6750
8.7%
us 6750
8.7%
gb 1222
 
1.6%
kingdom 1222
 
1.6%
france 758
 
1.0%
Other values (220) 11956
15.4%

Most occurring characters

ValueCountFrequency (%)
' 110712
16.0%
67827
 
9.8%
e 38199
 
5.5%
a 34681
 
5.0%
i 31757
 
4.6%
6 27678
 
4.0%
_ 27678
 
4.0%
: 27678
 
4.0%
1 27678
 
4.0%
n 27633
 
4.0%
Other values (53) 268760
38.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 264597
38.3%
Other Punctuation 156247
22.6%
Decimal Number 69195
 
10.0%
Space Separator 67827
 
9.8%
Uppercase Letter 57059
 
8.3%
Connector Punctuation 27678
 
4.0%
Close Punctuation 23839
 
3.5%
Open Punctuation 23839
 
3.5%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
U 15105
26.5%
S 14727
25.8%
A 8012
14.0%
K 2529
 
4.4%
C 1863
 
3.3%
G 1787
 
3.1%
B 1656
 
2.9%
F 1568
 
2.7%
R 1476
 
2.6%
J 1448
 
2.5%
Other values (16) 6888
12.1%
Lowercase Letter
ValueCountFrequency (%)
e 38199
14.4%
a 34681
13.1%
i 31757
12.0%
n 27633
10.4%
o 23538
8.9%
t 22642
8.6%
m 22606
8.5%
s 21117
8.0%
d 10373
 
3.9%
r 9279
 
3.5%
Other values (15) 22772
8.6%
Other Punctuation
ValueCountFrequency (%)
' 110712
70.9%
: 27678
 
17.7%
, 17857
 
11.4%
Decimal Number
ValueCountFrequency (%)
6 27678
40.0%
1 27678
40.0%
3 13839
20.0%
Close Punctuation
ValueCountFrequency (%)
} 13839
58.1%
] 10000
41.9%
Open Punctuation
ValueCountFrequency (%)
{ 13839
58.1%
[ 10000
41.9%
Space Separator
ValueCountFrequency (%)
67827
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 27678
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 368625
53.4%
Latin 321656
46.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 38199
11.9%
a 34681
10.8%
i 31757
9.9%
n 27633
 
8.6%
o 23538
 
7.3%
t 22642
 
7.0%
m 22606
 
7.0%
s 21117
 
6.6%
U 15105
 
4.7%
S 14727
 
4.6%
Other values (41) 69651
21.7%
Common
ValueCountFrequency (%)
' 110712
30.0%
67827
18.4%
6 27678
 
7.5%
_ 27678
 
7.5%
: 27678
 
7.5%
1 27678
 
7.5%
, 17857
 
4.8%
3 13839
 
3.8%
} 13839
 
3.8%
{ 13839
 
3.8%
Other values (2) 20000
 
5.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 690281
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
' 110712
16.0%
67827
 
9.8%
e 38199
 
5.5%
a 34681
 
5.0%
i 31757
 
4.6%
6 27678
 
4.0%
_ 27678
 
4.0%
: 27678
 
4.0%
1 27678
 
4.0%
n 27633
 
4.0%
Other values (53) 268760
38.9%

SpokenLanguages
Text

Distinct 974
Distinct (%) 9.7%
Missing 0
Missing (%) 0.0%
Memory size 156.2 KiB

Length

Max length 725
Median length 67
Mean length 94.5997
Min length 2

Characters and Unicode

Total characters 945997
Distinct characters 198
Distinct categories 12
Distinct scripts 16
Distinct blocks 16
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 740
Unique (%) 7.4%

Sample

1st row[{'english_name': 'English', 'iso_639_1': 'en', 'name': 'English'}]
2nd row[{'english_name': 'English', 'iso_639_1': 'en', 'name': 'English'}, {'english_name': 'Italian', 'iso_639_1': 'it', 'name': 'Italiano'}]
3rd row[{'english_name': 'English', 'iso_639_1': 'en', 'name': 'English'}]
4th row[{'english_name': 'French', 'iso_639_1': 'fr', 'name': 'Français'}, {'english_name': 'English', 'iso_639_1': 'en', 'name': 'English'}, {'english_name': 'Italian', 'iso_639_1': 'it', 'name': 'Italiano'}, {'english_name': 'Russian', 'iso_639_1': 'ru', 'name': 'Pусский'}]
5th row[{'english_name': 'Spanish', 'iso_639_1': 'es', 'name': 'Español'}]
ValueCountFrequency (%)
english 15714
18.3%
english_name 14176
16.5%
iso_639_1 14176
16.5%
name 14176
16.5%
en 7857
9.2%
spanish 895
 
1.0%
español 895
 
1.0%
es 895
 
1.0%
français 851
 
1.0%
french 851
 
1.0%
Other values (277) 15188
17.7%

Most occurring characters

ValueCountFrequency (%)
' 170112
18.0%
75674
 
8.0%
n 74418
 
7.9%
e 57384
 
6.1%
s 51181
 
5.4%
i 49770
 
5.3%
: 42528
 
4.5%
_ 42528
 
4.5%
a 40466
 
4.3%
h 33241
 
3.5%
Other values (188) 308695
32.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 441401
46.7%
Other Punctuation 245822
26.0%
Space Separator 75674
 
8.0%
Decimal Number 56704
 
6.0%
Connector Punctuation 42528
 
4.5%
Uppercase Letter 25980
 
2.7%
Close Punctuation 24176
 
2.6%
Open Punctuation 24176
 
2.6%
Other Letter 9047
 
1.0%
Nonspacing Mark 254
 
< 0.1%
Other values (2) 235
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 74418
16.9%
e 57384
13.0%
s 51181
11.6%
i 49770
11.3%
a 40466
9.2%
h 33241
7.5%
l 32548
7.4%
g 30645
6.9%
m 29009
 
6.6%
o 17466
 
4.0%
Other values (67) 25273
 
5.7%
Other Letter
ValueCountFrequency (%)
801
 
8.9%
801
 
8.9%
801
 
8.9%
566
 
6.3%
438
 
4.8%
366
 
4.0%
366
 
4.0%
366
 
4.0%
366
 
4.0%
366
 
4.0%
Other values (49) 3810
42.1%
Uppercase Letter
ValueCountFrequency (%)
E 16635
64.0%
F 1722
 
6.6%
S 1031
 
4.0%
I 1022
 
3.9%
J 803
 
3.1%
P 765
 
2.9%
D 632
 
2.4%
G 558
 
2.1%
M 410
 
1.6%
K 385
 
1.5%
Other values (19) 2017
 
7.8%
Spacing Mark
ValueCountFrequency (%)
ि 82
35.3%
82
35.3%
24
 
10.3%
14
 
6.0%
7
 
3.0%
7
 
3.0%
ி 7
 
3.0%
7
 
3.0%
2
 
0.9%
Nonspacing Mark
ValueCountFrequency (%)
ִ 96
37.8%
82
32.3%
ְ 48
18.9%
12
 
4.7%
7
 
2.8%
7
 
2.8%
2
 
0.8%
Other Punctuation
ValueCountFrequency (%)
' 170112
69.2%
: 42528
 
17.3%
, 32581
 
13.3%
/ 585
 
0.2%
? 10
 
< 0.1%
; 6
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
9 14176
25.0%
1 14176
25.0%
3 14176
25.0%
6 14176
25.0%
Close Punctuation
ValueCountFrequency (%)
} 14176
58.6%
] 10000
41.4%
Open Punctuation
ValueCountFrequency (%)
{ 14176
58.6%
[ 10000
41.4%
Space Separator
ValueCountFrequency (%)
75674
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 42528
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 469083
49.6%
Latin 464699
49.1%
Han 4758
 
0.5%
Cyrillic 2299
 
0.2%
Hangul 2196
 
0.2%
Arabic 1061
 
0.1%
Devanagari 492
 
0.1%
Thai 448
 
< 0.1%
Hebrew 384
 
< 0.1%
Greek 320
 
< 0.1%
Other values (6) 257
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 74418
16.0%
e 57384
12.3%
s 51181
11.0%
i 49770
10.7%
a 40466
8.7%
h 33241
7.2%
l 32548
7.0%
g 30645
6.6%
m 29009
 
6.2%
o 17466
 
3.8%
Other values (60) 48571
10.5%
Cyrillic
ValueCountFrequency (%)
с 669
29.1%
к 383
16.7%
и 360
15.7%
й 340
14.8%
у 319
13.9%
а 37
 
1.6%
р 33
 
1.4%
н 22
 
1.0%
ь 22
 
1.0%
У 22
 
1.0%
Other values (12) 92
 
4.0%
Common
ValueCountFrequency (%)
' 170112
36.3%
75674
16.1%
: 42528
 
9.1%
_ 42528
 
9.1%
, 32581
 
6.9%
9 14176
 
3.0%
} 14176
 
3.0%
1 14176
 
3.0%
{ 14176
 
3.0%
3 14176
 
3.0%
Other values (7) 34780
 
7.4%
Arabic
ValueCountFrequency (%)
ا 162
15.3%
ر 162
15.3%
ة 129
12.2%
ي 129
12.2%
ع 129
12.2%
ب 129
12.2%
ل 129
12.2%
س 18
 
1.7%
ی 18
 
1.7%
ف 18
 
1.7%
Other values (5) 38
 
3.6%
Han
ValueCountFrequency (%)
801
16.8%
801
16.8%
801
16.8%
566
11.9%
438
9.2%
347
7.3%
347
7.3%
219
 
4.6%
219
 
4.6%
广 219
 
4.6%
Hebrew
ValueCountFrequency (%)
ִ 96
25.0%
ר 48
12.5%
ת 48
12.5%
י 48
12.5%
ע 48
12.5%
ְ 48
12.5%
ב 48
12.5%
Greek
ValueCountFrequency (%)
λ 80
25.0%
κ 40
12.5%
ι 40
12.5%
ν 40
12.5%
η 40
12.5%
ά 40
12.5%
ε 40
12.5%
Georgian
ValueCountFrequency (%)
9
14.3%
9
14.3%
9
14.3%
9
14.3%
9
14.3%
9
14.3%
9
14.3%
Hangul
ValueCountFrequency (%)
366
16.7%
366
16.7%
366
16.7%
366
16.7%
366
16.7%
366
16.7%
Thai
ValueCountFrequency (%)
128
28.6%
64
14.3%
64
14.3%
64
14.3%
64
14.3%
64
14.3%
Devanagari
ValueCountFrequency (%)
ि 82
16.7%
82
16.7%
82
16.7%
82
16.7%
82
16.7%
82
16.7%
Gurmukhi
ValueCountFrequency (%)
7
16.7%
7
16.7%
7
16.7%
7
16.7%
7
16.7%
7
16.7%
Telugu
ValueCountFrequency (%)
24
33.3%
12
16.7%
12
16.7%
12
16.7%
12
16.7%
Tamil
ValueCountFrequency (%)
7
20.0%
7
20.0%
ி 7
20.0%
7
20.0%
7
20.0%
Sinhala
ValueCountFrequency (%)
2
20.0%
2
20.0%
2
20.0%
2
20.0%
2
20.0%
Bengali
ValueCountFrequency (%)
14
40.0%
7
20.0%
7
20.0%
7
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 931602
98.5%
CJK 4758
 
0.5%
None 2434
 
0.3%
Cyrillic 2299
 
0.2%
Hangul 2196
 
0.2%
Arabic 1061
 
0.1%
Devanagari 492
 
0.1%
Thai 448
 
< 0.1%
Hebrew 384
 
< 0.1%
Telugu 72
 
< 0.1%
Other values (6) 251
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
' 170112
18.3%
75674
 
8.1%
n 74418
 
8.0%
e 57384
 
6.2%
s 51181
 
5.5%
i 49770
 
5.3%
: 42528
 
4.6%
_ 42528
 
4.6%
a 40466
 
4.3%
h 33241
 
3.6%
Other values (58) 294300
31.6%
None
ValueCountFrequency (%)
ñ 895
36.8%
ç 890
36.6%
ê 146
 
6.0%
λ 80
 
3.3%
κ 40
 
1.6%
ι 40
 
1.6%
ν 40
 
1.6%
η 40
 
1.6%
ά 40
 
1.6%
ε 40
 
1.6%
Other values (14) 183
 
7.5%
CJK
ValueCountFrequency (%)
801
16.8%
801
16.8%
801
16.8%
566
11.9%
438
9.2%
347
7.3%
347
7.3%
219
 
4.6%
219
 
4.6%
广 219
 
4.6%
Cyrillic
ValueCountFrequency (%)
с 669
29.1%
к 383
16.7%
и 360
15.7%
й 340
14.8%
у 319
13.9%
а 37
 
1.6%
р 33
 
1.4%
н 22
 
1.0%
ь 22
 
1.0%
У 22
 
1.0%
Other values (12) 92
 
4.0%
Hangul
ValueCountFrequency (%)
366
16.7%
366
16.7%
366
16.7%
366
16.7%
366
16.7%
366
16.7%
Arabic
ValueCountFrequency (%)
ا 162
15.3%
ر 162
15.3%
ة 129
12.2%
ي 129
12.2%
ع 129
12.2%
ب 129
12.2%
ل 129
12.2%
س 18
 
1.7%
ی 18
 
1.7%
ف 18
 
1.7%
Other values (5) 38
 
3.6%
Thai
ValueCountFrequency (%)
128
28.6%
64
14.3%
64
14.3%
64
14.3%
64
14.3%
64
14.3%
Hebrew
ValueCountFrequency (%)
ִ 96
25.0%
ר 48
12.5%
ת 48
12.5%
י 48
12.5%
ע 48
12.5%
ְ 48
12.5%
ב 48
12.5%
Devanagari
ValueCountFrequency (%)
ि 82
16.7%
82
16.7%
82
16.7%
82
16.7%
82
16.7%
82
16.7%
Latin Ext Additional
ValueCountFrequency (%)
33
50.0%
ế 33
50.0%
Telugu
ValueCountFrequency (%)
24
33.3%
12
16.7%
12
16.7%
12
16.7%
12
16.7%
Bengali
ValueCountFrequency (%)
14
40.0%
7
20.0%
7
20.0%
7
20.0%
Georgian
ValueCountFrequency (%)
9
14.3%
9
14.3%
9
14.3%
9
14.3%
9
14.3%
9
14.3%
9
14.3%
Gurmukhi
ValueCountFrequency (%)
7
16.7%
7
16.7%
7
16.7%
7
16.7%
7
16.7%
7
16.7%
Tamil
ValueCountFrequency (%)
7
20.0%
7
20.0%
ி 7
20.0%
7
20.0%
7
20.0%
Sinhala
ValueCountFrequency (%)
2
20.0%
2
20.0%
2
20.0%
2
20.0%
2
20.0%

TagLine
Text

MISSING 

Distinct 7530
Distinct (%) 99.2%
Missing 2413
Missing (%) 24.1%
Memory size 156.2 KiB

Length

Max length 206
Median length 143
Mean length 39.759457
Min length 3

Characters and Unicode

Total characters 301655
Distinct characters 100
Distinct categories 14
Distinct scripts 2
Distinct blocks 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 7479
Unique (%) 98.6%

Sample

1st rowThey'll die when they're dead.
2nd rowJustice knows no borders.
3rd rowNeon lights... Suits with shoulder pads... Jumping from explosions in slow motion...
4th rowWe all share the same fate.
5th rowAttempting to survive in the middle of nowhere is her only option.
ValueCountFrequency (%)
the 3549
 
6.4%
a 1988
 
3.6%
to 1204
 
2.2%
of 1165
 
2.1%
is 1124
 
2.0%
you 967
 
1.7%
in 791
 
1.4%
and 586
 
1.1%
for 579
 
1.0%
one 502
 
0.9%
Other values (6347) 43096
77.6%

Most occurring characters

ValueCountFrequency (%)
47972
15.9%
e 31978
 
10.6%
t 18698
 
6.2%
o 18615
 
6.2%
a 16132
 
5.3%
n 15291
 
5.1%
r 14757
 
4.9%
i 14757
 
4.9%
s 14165
 
4.7%
h 12027
 
4.0%
Other values (90) 97263
32.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 222152
73.6%
Space Separator 47982
 
15.9%
Uppercase Letter 16471
 
5.5%
Other Punctuation 13766
 
4.6%
Decimal Number 867
 
0.3%
Dash Punctuation 259
 
0.1%
Final Punctuation 115
 
< 0.1%
Close Punctuation 14
 
< 0.1%
Open Punctuation 14
 
< 0.1%
Currency Symbol 8
 
< 0.1%
Other values (4) 7
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 31978
14.4%
t 18698
 
8.4%
o 18615
 
8.4%
a 16132
 
7.3%
n 15291
 
6.9%
r 14757
 
6.6%
i 14757
 
6.6%
s 14165
 
6.4%
h 12027
 
5.4%
l 9614
 
4.3%
Other values (24) 56118
25.3%
Uppercase Letter
ValueCountFrequency (%)
T 2658
16.1%
A 1464
 
8.9%
S 1187
 
7.2%
W 974
 
5.9%
I 957
 
5.8%
H 942
 
5.7%
B 806
 
4.9%
N 753
 
4.6%
F 745
 
4.5%
E 724
 
4.4%
Other values (16) 5261
31.9%
Other Punctuation
ValueCountFrequency (%)
. 9252
67.2%
' 1727
 
12.5%
, 1224
 
8.9%
! 886
 
6.4%
? 419
 
3.0%
101
 
0.7%
" 63
 
0.5%
: 37
 
0.3%
% 17
 
0.1%
* 13
 
0.1%
Other values (4) 27
 
0.2%
Decimal Number
ValueCountFrequency (%)
0 254
29.3%
1 184
21.2%
2 103
11.9%
9 64
 
7.4%
3 60
 
6.9%
5 49
 
5.7%
7 42
 
4.8%
6 41
 
4.7%
8 36
 
4.2%
4 34
 
3.9%
Dash Punctuation
ValueCountFrequency (%)
- 247
95.4%
8
 
3.1%
4
 
1.5%
Space Separator
ValueCountFrequency (%)
47972
> 99.9%
  10
 
< 0.1%
Final Punctuation
ValueCountFrequency (%)
113
98.3%
2
 
1.7%
Initial Punctuation
ValueCountFrequency (%)
2
66.7%
1
33.3%
Math Symbol
ValueCountFrequency (%)
~ 1
50.0%
+ 1
50.0%
Close Punctuation
ValueCountFrequency (%)
) 14
100.0%
Open Punctuation
ValueCountFrequency (%)
( 14
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 8
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 1
100.0%
Other Number
ValueCountFrequency (%)
½ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 238623
79.1%
Common 63032
 
20.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 31978
13.4%
t 18698
 
7.8%
o 18615
 
7.8%
a 16132
 
6.8%
n 15291
 
6.4%
r 14757
 
6.2%
i 14757
 
6.2%
s 14165
 
5.9%
h 12027
 
5.0%
l 9614
 
4.0%
Other values (50) 72589
30.4%
Common
ValueCountFrequency (%)
47972
76.1%
. 9252
 
14.7%
' 1727
 
2.7%
, 1224
 
1.9%
! 886
 
1.4%
? 419
 
0.7%
0 254
 
0.4%
- 247
 
0.4%
1 184
 
0.3%
113
 
0.2%
Other values (30) 754
 
1.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 301398
99.9%
Punctuation 231
 
0.1%
None 25
 
< 0.1%
Modifier Letters 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
47972
15.9%
e 31978
 
10.6%
t 18698
 
6.2%
o 18615
 
6.2%
a 16132
 
5.4%
n 15291
 
5.1%
r 14757
 
4.9%
i 14757
 
4.9%
s 14165
 
4.7%
h 12027
 
4.0%
Other values (72) 97006
32.2%
Punctuation
ValueCountFrequency (%)
113
48.9%
101
43.7%
8
 
3.5%
4
 
1.7%
2
 
0.9%
2
 
0.9%
1
 
0.4%
None
ValueCountFrequency (%)
  10
40.0%
é 4
 
16.0%
ñ 2
 
8.0%
ü 2
 
8.0%
á 2
 
8.0%
ō 1
 
4.0%
ê 1
 
4.0%
ù 1
 
4.0%
í 1
 
4.0%
½ 1
 
4.0%
Modifier Letters
ValueCountFrequency (%)
ʻ 1
100.0%

RunTime
Real number (ℝ)

ZEROS 

Distinct 220
Distinct (%) 2.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 101.7372
Minimum 0
Maximum 400
Zeros 137
Zeros (%) 1.4%
Negative 0
Negative (%) 0.0%
Memory size 156.2 KiB

Quantile statistics

Minimum 0
5-th percentile 63
Q1 91
Median 101
Q3 115
95-th percentile 140
Maximum 400
Range 400
Interquartile range (IQR) 24

Descriptive statistics

Standard deviation 27.703785
Coefficient of variation (CV) 0.27230732
Kurtosis 6.3812347
Mean 101.7372
Median Absolute Deviation (MAD) 12
Skewness -0.41509431
Sum 1017372
Variance 767.49969
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
90 295
 
2.9%
95 283
 
2.8%
100 275
 
2.8%
93 269
 
2.7%
105 245
 
2.5%
97 240
 
2.4%
98 238
 
2.4%
94 228
 
2.3%
92 227
 
2.3%
96 226
 
2.3%
Other values (210) 7474
74.7%
ValueCountFrequency (%)
0 137
1.4%
2 3
 
< 0.1%
3 8
 
0.1%
4 6
 
0.1%
5 9
 
0.1%
6 14
 
0.1%
7 10
 
0.1%
8 8
 
0.1%
9 8
 
0.1%
10 13
 
0.1%
ValueCountFrequency (%)
400 1
< 0.1%
333 1
< 0.1%
317 1
< 0.1%
254 1
< 0.1%
248 1
< 0.1%
247 1
< 0.1%
242 2
< 0.1%
240 1
< 0.1%
238 2
< 0.1%
237 1
< 0.1%

Revenue
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct 5579
Distinct (%) 55.8%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 62253720
Minimum 0
Maximum 2.923706 × 10^9
Zeros 4155
Zeros (%) 41.5%
Negative 0
Negative (%) 0.0%
Memory size 156.2 KiB

Quantile statistics

Minimum 0
5-th percentile 0
Q1 0
Median 3720364
Q3 54054618
95-th percentile 3.0726891 × 10^8
Maximum 2.923706 × 10^9
Range 2.923706 × 10^9
Interquartile range (IQR) 54054618

Descriptive statistics

Standard deviation 1.5623949 × 10^8
Coefficient of variation (CV) 2.5097214
Kurtosis 55.356164
Mean 62253720
Median Absolute Deviation (MAD) 3720364
Skewness 5.8965705
Sum 6.225372 × 10^11
Variance 2.4410779 × 10^16
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 4155
41.5%
11000000 11
 
0.1%
2000000 10
 
0.1%
10000000 10
 
0.1%
12000000 10
 
0.1%
30000000 9
 
0.1%
7000000 8
 
0.1%
25000000 8
 
0.1%
5000000 7
 
0.1%
8000000 7
 
0.1%
Other values (5569) 5765
57.6%
ValueCountFrequency (%)
0 4155
41.5%
1 1
 
< 0.1%
3 1
 
< 0.1%
7 1
 
< 0.1%
10 1
 
< 0.1%
29 1
 
< 0.1%
43 1
 
< 0.1%
94 1
 
< 0.1%
126 1
 
< 0.1%
201 1
 
< 0.1%
ValueCountFrequency (%)
2923706026 1
< 0.1%
2800000000 1
< 0.1%
2320250281 1
< 0.1%
2264162353 1
< 0.1%
2068223624 1
< 0.1%
2052415039 1
< 0.1%
1921847111 1
< 0.1%
1671537444 1
< 0.1%
1663075401 1
< 0.1%
1518815515 1
< 0.1%

Interactions

[Figures: 49 pairwise interaction plots (7 × 7 grid over the numeric variables)]

Correlations

Variable              Id  Popularity  VoteAverage  VoteCount  Budget  RunTime  Revenue  OriginalLanguage
Id                 1.000       0.082       -0.135     -0.510  -0.456   -0.212   -0.495             0.147
Popularity         0.082       1.000        0.161      0.343   0.245    0.078    0.278             0.000
VoteAverage       -0.135       0.161        1.000      0.322   0.063    0.319    0.163             0.117
VoteCount         -0.510       0.343        0.322      1.000   0.697    0.365    0.747             0.000
Budget            -0.456       0.245        0.063      0.697   1.000    0.373    0.794             0.017
RunTime           -0.212       0.078        0.319      0.365   0.373    1.000    0.387             0.123
Revenue           -0.495       0.278        0.163      0.747   0.794    0.387    1.000             0.000
OriginalLanguage   0.147       0.000        0.117      0.000   0.017    0.123    0.000             1.000
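
To read the matrix programmatically, the sketch below re-enters the coefficients listed above and prints the variable pairs whose absolute coefficient is at least 0.5. The 0.5 threshold is an assumption, chosen because it reproduces the High correlation alerts at the top of this report; `high_pairs` is an illustrative helper, not part of ydata-profiling.

```python
import pandas as pd

cols = ["Id", "Popularity", "VoteAverage", "VoteCount",
        "Budget", "RunTime", "Revenue", "OriginalLanguage"]

# Coefficients copied from the correlation table in this report.
values = [
    [1.000, 0.082, -0.135, -0.510, -0.456, -0.212, -0.495, 0.147],
    [0.082, 1.000, 0.161, 0.343, 0.245, 0.078, 0.278, 0.000],
    [-0.135, 0.161, 1.000, 0.322, 0.063, 0.319, 0.163, 0.117],
    [-0.510, 0.343, 0.322, 1.000, 0.697, 0.365, 0.747, 0.000],
    [-0.456, 0.245, 0.063, 0.697, 1.000, 0.373, 0.794, 0.017],
    [-0.212, 0.078, 0.319, 0.365, 0.373, 1.000, 0.387, 0.123],
    [-0.495, 0.278, 0.163, 0.747, 0.794, 0.387, 1.000, 0.000],
    [0.147, 0.000, 0.117, 0.000, 0.017, 0.123, 0.000, 1.000],
]
corr = pd.DataFrame(values, index=cols, columns=cols)


def high_pairs(corr, threshold=0.5):
    """Return (a, b, coefficient) for each unordered pair with |coefficient| >= threshold."""
    names = list(corr.columns)
    pairs = []
    for i, a in enumerate(names):
        # Only look at columns to the right of the diagonal so each pair appears once.
        for b in names[i + 1:]:
            value = float(corr.loc[a, b])
            if abs(value) >= threshold:
                pairs.append((a, b, value))
    return pairs


for a, b, value in high_pairs(corr):
    print(f"{a} ~ {b}: {value:+.3f}")
```

With this threshold, Id and Revenue (-0.495) fall just short of the cut-off, which matches the alert that pairs Id only with VoteCount.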

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
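
One common way to compute a nullity correlation is to correlate the missing-value indicators of each column. The sketch below does this with pandas on a small made-up frame; the column names are borrowed from this dataset, but the values are hypothetical, and `nullity_correlation` is an illustrative helper rather than the profiler's own code.

```python
import pandas as pd


def nullity_correlation(df: pd.DataFrame) -> pd.DataFrame:
    """Pearson correlation between the missing-value indicators of each column."""
    missing = df.isnull()
    # A column that is never (or always) missing has zero variance,
    # so its correlation is undefined; leave such columns out.
    varying = missing.loc[:, missing.nunique() > 1]
    return varying.astype(int).corr()


# Hypothetical toy data: TagLine and Overview tend to be missing together,
# Title is always present.
toy = pd.DataFrame({
    "TagLine": ["a", None, None, "d", None, "f"],
    "Overview": ["x", None, None, "y", "z", "w"],
    "Title": ["t1", "t2", "t3", "t4", "t5", "t6"],
})

print(nullity_correlation(toy))
```

A value near 1 means two columns tend to be missing in the same rows, a value near -1 means one tends to be present when the other is missing, and a value near 0 means the two patterns are unrelated.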

Sample

Index | GenreIds | Id | OriginalLanguage | OriginalTitle | Overview | Popularity | ReleaseDate | Title | VoteAverage | VoteCount | Budget | ProductionCompanies | ProductionCountries | SpokenLanguages | TagLine | RunTime | Revenue
0 | [28, 12, 53] | 299054 | en | Expend4bles | Armed with every weapon they can get their hands on and the skills to use them, The Expendables are the world’s last line of defense and the team that gets called when all other options are off the table. But new team members with new styles and tactics are going to give “new blood” a whole new meaning. | 3741.062 | 2023-09-15 | Expend4bles | 6.4 | 364 | 100000000 | [{'id': 1020, 'logo_path': '/kuUIHNwMec4dwOLghDhhZJzHZTd.png', 'name': 'Millennium Media', 'origin_country': 'US'}, {'id': 48738, 'logo_path': None, 'name': 'Campbell Grobman Films', 'origin_country': 'US'}, {'id': 1632, 'logo_path': '/cisLn1YAUuptXVBa0xjq7ST9cH0.png', 'name': 'Lionsgate', 'origin_country': 'US'}] | [{'iso_3166_1': 'US', 'name': 'United States of America'}] | [{'english_name': 'English', 'iso_639_1': 'en', 'name': 'English'}] | They'll die when they're dead. | 103 | 30000000
1 | [28, 53, 80] | 926393 | en | The Equalizer 3 | Robert McCall finds himself at home in Southern Italy but he discovers his friends are under the control of local crime bosses. As events turn deadly, McCall knows what he has to do: become his friends' protector by taking on the mafia. | 2471.515 | 2023-08-30 | The Equalizer 3 | 7.3 | 1027 | 70000000 | [{'id': 1423, 'logo_path': '/1rbAwGQzrNvXDICD6HWEn1YqfAV.png', 'name': 'Escape Artists', 'origin_country': 'US'}, {'id': 5, 'logo_path': '/wrweLpBqRYcAM7kCSaHDJRxKGOP.png', 'name': 'Columbia Pictures', 'origin_country': 'US'}, {'id': 10400, 'logo_path': '/9LlB2YAwXTkUAhx0pItSo6pDlkB.png', 'name': 'Eagle Pictures', 'origin_country': 'IT'}, {'id': 44967, 'logo_path': None, 'name': 'ZHIV Productions', 'origin_country': ''}] | [{'iso_3166_1': 'IT', 'name': 'Italy'}, {'iso_3166_1': 'US', 'name': 'United States of America'}] | [{'english_name': 'English', 'iso_639_1': 'en', 'name': 'English'}, {'english_name': 'Italian', 'iso_639_1': 'it', 'name': 'Italiano'}] | Justice knows no borders. | 109 | 176933602
2 | [16, 28, 14] | 1034062 | en | Mortal Kombat Legends: Cage Match | In 1980s Hollywood, action star Johnny Cage is looking to become an A-list actor. But when his costar, Jennifer, goes missing from set, Johnny finds himself thrust into a world filled with shadows, danger, and deceit. As he embarks on a bloody journey, Johnny quickly discovers the City of Angels has more than a few devils in its midst. | 2223.430 | 2023-10-17 | Mortal Kombat Legends: Cage Match | 7.8 | 27 | 0 | [{'id': 2785, 'logo_path': '/l5zW8jjmQOCx2dFmvnmbYmqoBmL.png', 'name': 'Warner Bros. Animation', 'origin_country': 'US'}] | [{'iso_3166_1': 'US', 'name': 'United States of America'}] | [{'english_name': 'English', 'iso_639_1': 'en', 'name': 'English'}] | Neon lights... Suits with shoulder pads... Jumping from explosions in slow motion... | 80 | 0
3 | [28, 53] | 575264 | en | Mission: Impossible - Dead Reckoning Part One | Ethan Hunt and his IMF team embark on their most dangerous mission yet: To track down a terrifying new weapon that threatens all of humanity before it falls into the wrong hands. With control of the future and the world's fate at stake and dark forces from Ethan's past closing in, a deadly race around the globe begins. Confronted by a mysterious, all-powerful enemy, Ethan must consider that nothing can matter more than his mission—not even the lives of those he cares about most. | 2032.927 | 2023-07-08 | Mission: Impossible - Dead Reckoning Part One | 7.7 | 1799 | 291000000 | [{'id': 4, 'logo_path': '/gz66EfNoYPqHTYI4q9UEN4CbHRc.png', 'name': 'Paramount', 'origin_country': 'US'}, {'id': 82819, 'logo_path': '/gXfFl9pRPaoaq14jybEn1pHeldr.png', 'name': 'Skydance', 'origin_country': 'US'}, {'id': 21777, 'logo_path': None, 'name': 'TC Productions', 'origin_country': 'US'}] | [{'iso_3166_1': 'US', 'name': 'United States of America'}] | [{'english_name': 'French', 'iso_639_1': 'fr', 'name': 'Français'}, {'english_name': 'English', 'iso_639_1': 'en', 'name': 'English'}, {'english_name': 'Italian', 'iso_639_1': 'it', 'name': 'Italiano'}, {'english_name': 'Russian', 'iso_639_1': 'ru', 'name': 'Pусский'}] | We all share the same fate. | 164 | 567148955
4 | [53, 18] | 1151534 | es | Nowhere | A young pregnant woman named Mia escapes from a country at war by hiding in a maritime container aboard a cargo ship. After a violent storm, Mia gives birth to the child while lost at sea, where she must fight to survive. | 1627.678 | 2023-09-29 | Nowhere | 7.6 | 686 | 0 | [{'id': 204005, 'logo_path': None, 'name': 'Rock & Ruz', 'origin_country': 'ES'}] | [{'iso_3166_1': 'ES', 'name': 'Spain'}] | [{'english_name': 'Spanish', 'iso_639_1': 'es', 'name': 'Español'}] | Attempting to survive in the middle of nowhere is her only option. | 109 | 0
5 | [27, 9648, 53] | 968051 | en | The Nun II | In 1956 France, a priest is violently murdered, and Sister Irene begins to investigate. She once again comes face-to-face with a powerful evil. | 1594.559 | 2023-09-06 | The Nun II | 7.0 | 1086 | 38500000 | [{'id': 12, 'logo_path': '/mevhneWSqbjU22D1MXNd4H9x0r0.png', 'name': 'New Line Cinema', 'origin_country': 'US'}, {'id': 76907, 'logo_path': '/ygMQtjsKX7BZkCQhQZY82lgnCUO.png', 'name': 'Atomic Monster', 'origin_country': 'US'}, {'id': 11565, 'logo_path': None, 'name': 'The Safran Company', 'origin_country': 'US'}] | [{'iso_3166_1': 'US', 'name': 'United States of America'}] | [{'english_name': 'English', 'iso_639_1': 'en', 'name': 'English'}, {'english_name': 'French', 'iso_639_1': 'fr', 'name': 'Français'}] | Confess your sins. | 110 | 262010000
6 | [28, 80, 53] | 961268 | ko | 발레리나 | Grieving the loss of a best friend she couldn't protect, an ex-bodyguard sets out to fulfill her dear friend's last wish: sweet revenge. | 1521.075 | 2023-10-05 | Ballerina | 7.0 | 200 | 0 | [{'id': 127541, 'logo_path': '/Aq35mXuZv7lhPm8a60YKRaB9Vek.png', 'name': 'Climax Studios', 'origin_country': 'KR'}] | [{'iso_3166_1': 'KR', 'name': 'South Korea'}] | [{'english_name': 'Korean', 'iso_639_1': 'ko', 'name': '한국어/조선말'}] | Merciless and ruthless, to hell. | 93 | 0
7 | [27, 53] | 951491 | en | Saw X | Between the events of 'Saw' and 'Saw II', a sick and desperate John Kramer travels to Mexico for a risky and experimental medical procedure in hopes of a miracle cure for his cancer, only to discover the entire operation is a scam to defraud the most vulnerable. Armed with a newfound purpose, the infamous serial killer returns to his work, turning the tables on the con artists in his signature visceral way through devious, deranged, and ingenious traps. | 1469.177 | 2023-09-26 | Saw X | 7.3 | 287 | 13000000 | [{'id': 2061, 'logo_path': '/o9LbN33hRaj4qcebUv1bikyXeoB.png', 'name': 'Twisted Pictures', 'origin_country': 'US'}, {'id': 1632, 'logo_path': '/cisLn1YAUuptXVBa0xjq7ST9cH0.png', 'name': 'Lionsgate', 'origin_country': 'US'}] | [{'iso_3166_1': 'US', 'name': 'United States of America'}] | [{'english_name': 'English', 'iso_639_1': 'en', 'name': 'English'}, {'english_name': 'Spanish', 'iso_639_1': 'es', 'name': 'Español'}] | Witness the return of Jigsaw. | 118 | 71984243
8 | [12, 28, 18] | 980489 | en | Gran Turismo | The ultimate wish-fulfillment tale of a teenage Gran Turismo player whose gaming skills won him a series of Nissan competitions to become an actual professional racecar driver. | 1315.518 | 2023-08-09 | Gran Turismo | 8.1 | 1127 | 60000000 | [{'id': 125281, 'logo_path': '/3hV8pyxzAJgEjiSYVv1WZ0ZYayp.png', 'name': 'PlayStation Productions', 'origin_country': 'US'}, {'id': 84792, 'logo_path': '/7Rfr3Zu6QnHpXW2VdSEzUminAQd.png', 'name': '2.0 Entertainment', 'origin_country': 'US'}, {'id': 5, 'logo_path': '/wrweLpBqRYcAM7kCSaHDJRxKGOP.png', 'name': 'Columbia Pictures', 'origin_country': 'US'}] | [{'iso_3166_1': 'US', 'name': 'United States of America'}] | [{'english_name': 'English', 'iso_639_1': 'en', 'name': 'English'}, {'english_name': 'Japanese', 'iso_639_1': 'ja', 'name': '日本語'}, {'english_name': 'German', 'iso_639_1': 'de', 'name': 'Deutsch'}] | From gamer to racer. | 135 | 114800000
9 | [53, 878, 28] | 937249 | en | 57 Seconds | When a tech blogger lands an interview with a tech guru and stops an attack on him, he finds a mysterious ring that takes him back 57 seconds into the past. | 1304.978 | 2023-09-29 | 57 Seconds | 5.4 | 111 | 0 | [{'id': 189103, 'logo_path': '/hu0qcD4k7kfWpdAewqmJSUyZPa7.png', 'name': 'Ashland Hill Media Finance', 'origin_country': 'US'}, {'id': 176331, 'logo_path': None, 'name': 'BGG Capital', 'origin_country': ''}, {'id': 12029, 'logo_path': '/8PAf5K4VVI6xO9SjB7bxLtpi4xH.png', 'name': 'Curmudgeon Films', 'origin_country': 'US'}] | [{'iso_3166_1': 'US', 'name': 'United States of America'}] | [{'english_name': 'English', 'iso_639_1': 'en', 'name': 'English'}] | Rewind the past. Avenge the future. | 99 | 0
Index | GenreIds | Id | OriginalLanguage | OriginalTitle | Overview | Popularity | ReleaseDate | Title | VoteAverage | VoteCount | Budget | ProductionCompanies | ProductionCountries | SpokenLanguages | TagLine | RunTime | Revenue
9990 | [9648, 80, 53] | 11092 | en | Presumed Innocent | Rusty Sabich is a deputy prosecutor engaged in an obsessive affair with a coworker who is murdered. Soon after, he's accused of the crime. And his fight to clear his name becomes a whirlpool of lies and hidden passions. | 13.054 | 1990-07-27 | Presumed Innocent | 6.8 | 595 | 22000000 | [{'id': 932, 'logo_path': None, 'name': 'Mirage Enterprises', 'origin_country': 'US'}, {'id': 174, 'logo_path': '/IuAlhI9eVC9Z8UQWOIDdWRKSEJ.png', 'name': 'Warner Bros. Pictures', 'origin_country': 'US'}] | [{'iso_3166_1': 'US', 'name': 'United States of America'}] | [{'english_name': 'English', 'iso_639_1': 'en', 'name': 'English'}] | Some people would kill for love | 127 | 221303188
9991 | [18, 10770, 36] | 664423 | en | The Windermere Children | The story of the pioneering project to rehabilitate child survivors of the Holocaust on the shores of Lake Windermere. | 13.052 | 2020-01-27 | The Windermere Children | 7.5 | 96 | 0 | [{'id': 16049, 'logo_path': '/2hb0R2GJ6BvgwmyqO7fIgtjsakt.png', 'name': 'Wall to Wall', 'origin_country': 'GB'}, {'id': 109759, 'logo_path': '/na8NZKqFc8Ep9qAg8KQUcH1DpQl.png', 'name': 'Warner Bros. International Television Production Germany', 'origin_country': 'DE'}, {'id': 4606, 'logo_path': '/otZHbf1HmzLRQsZFSqJXkf8EHz7.png', 'name': 'ZDF', 'origin_country': 'DE'}, {'id': 11667, 'logo_path': '/dhBJdstAolGmqRfbQfElhsU1cTo.png', 'name': 'Northern Ireland Screen', 'origin_country': 'GB'}] | [{'iso_3166_1': 'DE', 'name': 'Germany'}, {'iso_3166_1': 'GB', 'name': 'United Kingdom'}] | [{'english_name': 'English', 'iso_639_1': 'en', 'name': 'English'}, {'english_name': 'German', 'iso_639_1': 'de', 'name': 'Deutsch'}] | NaN | 88 | 0
9992 | [27, 878] | 3077 | en | Son of Frankenstein | One of the sons of late Dr. Henry Frankenstein finds his father's ghoulish creation in a coma and revives him, only to find out the monster is controlled by Ygor who is bent on revenge. | 13.052 | 1939-01-13 | Son of Frankenstein | 6.7 | 205 | 420000 | [{'id': 33, 'logo_path': '/8lvHyhjr8oUKOOy2dKXoALWKdp0.png', 'name': 'Universal Pictures', 'origin_country': 'US'}] | [{'iso_3166_1': 'US', 'name': 'United States of America'}] | [{'english_name': 'English', 'iso_639_1': 'en', 'name': 'English'}] | The black shadows of the past bred this half-man . . . half-demon ! . . . creating a new and terrible juggernaut of destruction ! | 99 | 0
9993 | [18, 10749] | 413543 | hi | Dear Zindagi | An unconventional thinker helps a budding cinematographer gain a new perspective on life. | 13.051 | 2016-11-23 | Dear Zindagi | 7.1 | 210 | 4300000 | [{'id': 2343, 'logo_path': '/fkrlAFxgAtIHgYJGIQcDggeucKV.png', 'name': 'Red Chillies Entertainment', 'origin_country': 'IN'}, {'id': 19146, 'logo_path': '/5Ff25ornzVNhm5skuAvMAR556NB.png', 'name': 'Dharma Productions', 'origin_country': 'IN'}, {'id': 78597, 'logo_path': None, 'name': 'Hope Productions', 'origin_country': 'IN'}] | [{'iso_3166_1': 'IN', 'name': 'India'}] | [{'english_name': 'English', 'iso_639_1': 'en', 'name': 'English'}, {'english_name': 'Hindi', 'iso_639_1': 'hi', 'name': 'हिन्दी'}] | NaN | 151 | 3376375
9994 | [12, 18, 28, 53] | 14400 | fr | Largo Winch | After a powerful billionaire is murdered, his secret adoptive son must race to prove his legitimacy, find his father's killers and stop them from taking over his financial empire. | 13.051 | 2008-12-17 | The Heir Apparent: Largo Winch | 6.0 | 486 | 25412760 | [{'id': 6750, 'logo_path': None, 'name': 'Pan-Européenne', 'origin_country': 'FR'}, {'id': 856, 'logo_path': '/3tfzS2CrX6Ntbu927XzHXEPDA6y.png', 'name': 'Wild Bunch', 'origin_country': 'FR'}, {'id': 356, 'logo_path': '/tSJvuFaLIp7l0ONLUiAHA61GbXu.png', 'name': 'TF1 Films Production', 'origin_country': 'FR'}, {'id': 139617, 'logo_path': None, 'name': 'Araneo', 'origin_country': 'BE'}, {'id': 104, 'logo_path': '/9aotxauvc9685tq9pTcRJszuT06.png', 'name': 'Canal+', 'origin_country': 'FR'}, {'id': 14362, 'logo_path': None, 'name': 'October Pictures', 'origin_country': 'HK'}] | [{'iso_3166_1': 'BE', 'name': 'Belgium'}, {'iso_3166_1': 'FR', 'name': 'France'}, {'iso_3166_1': 'HK', 'name': 'Hong Kong'}] | [{'english_name': 'French', 'iso_639_1': 'fr', 'name': 'Français'}, {'english_name': 'English', 'iso_639_1': 'en', 'name': 'English'}, {'english_name': 'Serbian', 'iso_639_1': 'sr', 'name': 'Srpski'}, {'english_name': 'Portuguese', 'iso_639_1': 'pt', 'name': 'Português'}] | NaN | 108 | 0
9995 | [28, 80, 53] | 2749 | en | 15 Minutes | When Eastern European criminals Oleg and Emil come to New York City to pick up their share of a heist score, Oleg steals a video camera and starts filming their activities, both legal and illegal. When they learn how the American media circus can make a remorseless killer look like the victim and make them rich, they target media-savvy NYPD Homicide Detective Eddie Flemming and media-naive FDNY Fire Marshal Jordy Warsaw, the cops investigating their murder and torching of their former criminal partner, filming everything to sell to the local tabloid TV show "Top Story." | 13.051 | 2001-03-01 | 15 Minutes | 5.9 | 646 | 60000000 | [{'id': 376, 'logo_path': None, 'name': 'Industry Entertainment', 'origin_country': ''}, {'id': 11391, 'logo_path': '/t6m0uRTzaFHCsvEpikENE0PWJGb.png', 'name': 'Tribeca Productions', 'origin_country': 'US'}, {'id': 67171, 'logo_path': None, 'name': 'New Redemption Pictures', 'origin_country': ''}] | [{'iso_3166_1': 'DE', 'name': 'Germany'}, {'iso_3166_1': 'US', 'name': 'United States of America'}] | [{'english_name': 'English', 'iso_639_1': 'en', 'name': 'English'}, {'english_name': 'Greek', 'iso_639_1': 'el', 'name': 'ελληνικά'}, {'english_name': 'Russian', 'iso_639_1': 'ru', 'name': 'Pусский'}, {'english_name': 'Czech', 'iso_639_1': 'cs', 'name': 'Český'}] | America Likes to Watch | 120 | 56359980
9996 | [18, 28, 53] | 11128 | en | Ladder 49 | Under the watchful eye of his mentor, Captain Mike Kennedy, probationary firefighter Jack Morrison matures into a seasoned veteran at a Baltimore fire station. However, Jack has reached a crossroads as the sacrifices he's made have put him in harm's way innumerable times and significantly impacted his relationship with his wife and kids. | 13.050 | 2004-10-01 | Ladder 49 | 6.4 | 707 | 60000000 | [{'id': 919, 'logo_path': None, 'name': 'Beacon Communications', 'origin_country': ''}, {'id': 9195, 'logo_path': '/ou5BUbtulr6tIt699q6xJiEQTR9.png', 'name': 'Touchstone Pictures', 'origin_country': 'US'}, {'id': 10157, 'logo_path': None, 'name': 'Beacon Pictures', 'origin_country': ''}, {'id': 877, 'logo_path': None, 'name': 'Casey Silver Productions', 'origin_country': 'US'}, {'id': 53987, 'logo_path': None, 'name': 'Fantail Films Inc.', 'origin_country': ''}] | [{'iso_3166_1': 'US', 'name': 'United States of America'}] | [{'english_name': 'English', 'iso_639_1': 'en', 'name': 'English'}] | Their greatest challenge lies in rescuing one of their own | 115 | 74541707
9997 | [18, 35] | 484482 | fr | Le Grand Bain | 40-year-old Bertrand has been suffering from depression for the last two years and is barely able to keep his head above water. Despite the medication he gulps down all day, every day, and his wife's encouragement, he is unable to find any meaning in his life. Curiously, he will end up finding this sense of purpose at the swimming pool, by joining an all-male synchronised swimming team. | 13.049 | 2018-10-24 | Sink or Swim | 6.9 | 1398 | 0 | [{'id': 34780, 'logo_path': '/ahgzFRXKF2fkOV0WfB0RCxEmxki.png', 'name': 'Chi-Fou-Mi Productions', 'origin_country': 'FR'}, {'id': 2612, 'logo_path': '/3ulBLchjjnjVmvyLQSwO62MDKLE.png', 'name': 'Les Productions du Trésor', 'origin_country': 'FR'}] | [{'iso_3166_1': 'BE', 'name': 'Belgium'}, {'iso_3166_1': 'FR', 'name': 'France'}] | [{'english_name': 'French', 'iso_639_1': 'fr', 'name': 'Français'}] | NaN | 122 | 0
9998 | [18] | 453755 | en | Arctic | A man stranded in the Arctic is finally about to receive his long awaited rescue. However, after a tragic accident, his opportunity is lost and he must then decide whether to remain in the relative safety of his camp or embark on a deadly trek through the unknown for potential salvation. | 13.049 | 2018-11-21 | Arctic | 6.5 | 1095 | 2000000 | [{'id': 35849, 'logo_path': '/bHYIJoy2ri7crfHugwR0AdF3qdM.png', 'name': 'Armory Films', 'origin_country': 'US'}, {'id': 12496, 'logo_path': '/tk6stgFevcJ2iWyd0IaEfVStNqL.png', 'name': 'Pegasus Pictures', 'origin_country': 'IS'}, {'id': 21775, 'logo_path': None, 'name': 'Union Entertainment Group', 'origin_country': ''}, {'id': 120207, 'logo_path': None, 'name': 'The Domain Group', 'origin_country': ''}, {'id': 12142, 'logo_path': '/rPnEeMwxjI6rYMGqkWqIWwIJXxi.png', 'name': 'XYZ Films', 'origin_country': 'US'}] | [{'iso_3166_1': 'IS', 'name': 'Iceland'}, {'iso_3166_1': 'US', 'name': 'United States of America'}] | [{'english_name': 'Danish', 'iso_639_1': 'da', 'name': 'Dansk'}, {'english_name': 'English', 'iso_639_1': 'en', 'name': 'English'}] | Survival is the only option | 98 | 4100000
9999 | [10402, 99, 10751] | 54518 | en | Justin Bieber: Never Say Never | Tells the story of Justin Bieber, the kid from Canada with the hair, the smile and the voice: It chronicles his unprecedented rise to fame, all the way from busking in the streets of Stratford, Canada to putting videos on YouTube to selling out Madison Square Garden in New York as the headline act during the My World Tour from 2010. It features Usher, Scooter Braun, Ludacris, Sean Kingston, Antonio "L.A." Reid, Boyz II Men, Miley Cyrus, Jaden Smith, Justin's family members and parts of his crew and huge fanbase in a mix of interviews and guest performances. It was released in 3D in theaters all around the world and is the highest grossing concert movie of all time, beating the previous record held by Michael Jackson's This Is It from 2009. | 13.049 | 2011-02-11 | Justin Bieber: Never Say Never | 5.2 | 378 | 13000000 | [{'id': 7377, 'logo_path': '/zdVYfWyiQmLgyq8V73HRHpCWjRO.png', 'name': 'Insurge Pictures', 'origin_country': 'US'}, {'id': 7378, 'logo_path': None, 'name': 'Magical Elves', 'origin_country': 'US'}, {'id': 7379, 'logo_path': '/gwpgBGRP4nXn3M6gXUV6W02uKGS.png', 'name': 'Scooter Braun Films', 'origin_country': 'US'}, {'id': 4, 'logo_path': '/gz66EfNoYPqHTYI4q9UEN4CbHRc.png', 'name': 'Paramount', 'origin_country': 'US'}, {'id': 162509, 'logo_path': None, 'name': 'L.A. Reid Media', 'origin_country': ''}, {'id': 87041, 'logo_path': None, 'name': 'AEG Live', 'origin_country': ''}, {'id': 746, 'logo_path': '/kc7bdIVTBkJYy9aDK1QDDTAL463.png', 'name': 'MTV Films', 'origin_country': 'US'}, {'id': 78156, 'logo_path': None, 'name': 'The Island Def Jam Music Group', 'origin_country': ''}] | [{'iso_3166_1': 'US', 'name': 'United States of America'}] | [{'english_name': 'English', 'iso_639_1': 'en', 'name': 'English'}] | Find out what's possible if you never give up. | 105 | 98500000
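
The GenreIds, ProductionCompanies, ProductionCountries and SpokenLanguages cells in the sample rows look like Python literals stored as text. Assuming that is the case, a minimal sketch for turning such cells back into lists uses the standard library's `ast.literal_eval`; `parse_literal` is a hypothetical helper name, and the two input strings are copied from row 0 of the sample.

```python
import ast


def parse_literal(cell):
    """Parse a cell holding a Python literal; return [] for missing or unparsable cells."""
    if not isinstance(cell, str):
        # Missing values (NaN) arrive as floats, not strings.
        return []
    try:
        return ast.literal_eval(cell)
    except (ValueError, SyntaxError):
        return []


# Cell contents copied from row 0 of the sample.
genre_ids = parse_literal("[28, 12, 53]")
companies = parse_literal(
    "[{'id': 48738, 'logo_path': None, "
    "'name': 'Campbell Grobman Films', 'origin_country': 'US'}]"
)
company_names = [company["name"] for company in companies]

print(genre_ids)
print(company_names)
```

`ast.literal_eval` only accepts literals (lists, dicts, strings, numbers, `None`), so it is a safer choice than `eval` for data read from a file. `json.loads` would reject these cells because they use single quotes and `None`.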